Abstract

This thesis proposes a timing simulator (RSIM) based on a uniquely simple transistor model. RSIM
allows a designer to determine both the functional and approximate timing characteristics of a MOS
network with more accuracy than gate-level simulation, and for larger circuits than can be
accommodated by circuit analysis programs. In RSIM, transistors are modeled as resistors; the logic
states of a transistor's terminal nodes determine its effective resistance. Using this model, a MOS
network is simulated as a network of resistors where each node's value is determined by the resistance
of its connections to various inputs. Transition times are determined from the RC time constant
calculated for the node by examining the surrounding network (R from the transistors, C from the
interconnect and gate capacitance). The network's behavior as inputs are given values is calculated by
an efficient event-driven algorithm.

Two  changes  to  the  underlying  model  are  also  investigated: 

(1)  further  simplifying  the  transistor  model  to  an  on/off  switch  (which  can  be 
thought  of  as  a  degenerate  resistor).  Several  approaches  to  switch-level 
simulation  are  developed,  one  particularly  well-suited  for  implementation  using 
parallel  hardware. 

(2) modeling the behavior of a network of switches by a system of logic equations.
Various compilation strategies are evaluated for producing code that implements
the system of equations.


Name  and  Title  of  Thesis  Supervisor: 

Stephen  A.  Ward, 

Associate  Professor  of  Computer  Science  and  Engineering 
Key  Words  and  Phrases: 

circuit  simulation,  logic  simulation,  timing  analysis,  CAD  tools 




ACKNOWLEDGMENTS


Thanks  everybody: 

Steve  Ward 

Ron  Rivest 

Gerry  Sussman 

Bert Halstead

Clark Baker

and  the  rest  of  RTS,  past  and  present 

Mark  Johnson 

Dave  Gross 

Jeff  Fox 

Doug  Williams 

Bob  Yodlowski 

Debbie  Cohn 

The  good  advice,  kind  words,  insight,  and  support  provided  over  the  years  by  these  fine  folks,  and 
others,  have  made  this  thesis  possible. 

This  research  was  supported  by  the  Advanced  Research  Projects  Agency  of  the  Department  of 
Defense  and  was  monitored  by  the  Office  of  Naval  Research  (Contract  Nos.  N00014-75-C-0661  and 
N00014-83-K-0125). 




TABLE OF CONTENTS

1. Introduction
1.1 Overview of the thesis
1.2 Outline of the remaining chapters

2. A Linear Network Model for MOS Simulation
2.1 RSIM's transistor model
2.2 RSIM's node model
2.3 RSIM's network model
2.4 Calibrating and using the RSIM model
2.5 Summary

3. Justification of the Linear Network Model
3.1 Electrical models for mosfets and gates
3.2 Node voltages
3.3 Propagation delay: overview
3.4 Propagation delay: logic gates
3.5 Propagation delay: source-followers and pass transistors
3.6 Implications for the RSIM model

4. Simulation Using a Linear Network Model
4.1 The RSIM simulation algorithm
4.2 Speeding up the simulation
4.3 Escape mechanisms
4.4 An evaluation of RSIM

5. Simulation Using a Switch Network Model
5.1 Representing node values
5.2 Developing the switch model
5.3 The global switch model
5.4 The local switch model

6. Simulation Using a Pre-compiled Network Model
6.1 Reducing switch paths to logic equations
6.2 Compiling logic equations for simulation

7. Conclusions

Appendix 1. Proof of Lemma 5.3
Appendix 2. RSIM Calibration Tables for a 5μ nMOS Process
Appendix 3. Approximation for Resistor Divider and Series Resistor

References




CHAPTER  ONE 

INTRODUCTION 


Simulation  plays  an  important  role  in  the  design  of  integrated  circuits.  Using  simulation,  a 
designer  can  determine  both  the  functionality  and  the  performance  of  a  design  before  the  expensive 
and  time-consuming  step  of  manufacture.  The  ability  to  discover  errors  early  in  the  design  cycle  is 
especially  important  for  MOS  circuits,  where  recent  advances  in  manufacturing  technology  permit  the 
designer  to  build  a  single  circuit  that  is  an  order  of  magnitude  larger  than  ever  before  possible.  This 
thesis  presents  three  new  algorithms  designed  specifically  for  the  simulation  of  large  digital  MOS 
circuits. 

Today's MOS circuits offer special challenges to a simulation program, challenges that are not met
very well by current simulators. New integrated circuits can incorporate hundreds of thousands of
transistors; the sheer number of transistors dictates that a simulation algorithm use simple,
computationally efficient transistor models. In addition, designers take advantage of the symmetry of
the MOS transistor to build circuit configurations with behavior beyond the ken of traditional logic
simulators. The new simulators introduced here are designed to meet these challenges.


1.1. Overview of the thesis

To use a simulator, the designer enters a design into the computer, typically in the form of a list
of circuit components where each component connects to one or more nodes. A node serves as a wire,
transmitting the output of one circuit component to other components connected to the same node.
The designer then specifies the voltages or logic levels of particular nodes, and calls upon the simulator
to predict the voltages or logic levels of other nodes in the circuit. The simulator bases its predictions
on models describing the operation of the components; a simulator is characterized by the types of
component models it employs. Two of the more popular approaches are:

• component models based on the actual physics of the component: for example, a
transistor model that relates current flow through the transistor to the terminal
voltages, device topology, and manufacturing parameters of the actual device.

• component models based on a description of the logic operation performed by the
component, e.g., NAND and NOR gates.

The first type of model is found in circuit analysis programs such as ASTAP [Weeks73] or SPICE
[Nagel75], which try to predict the actual behavior of each component with a high degree of accuracy.
Current circuit analysis programs do the job well, perhaps too well; at no small cost, they provide a
wealth of detail, at sub-nanosecond resolution, about the voltage of each node and the amount of
current through each device. (For example, a properly calibrated circuit analysis program is able to
predict, within a few per cent, the amount of current that flows through an actual transistor.) This
level of detail would swamp the designer if collected for the entire circuit while simulating, say, a
microprocessor. Fortunately, the designer is spared this fate, since the computational cost of circuit
analysis restricts its applicability to circuits with no more than a few hundred devices.

One solution to the problem of simulator performance is to adopt a simpler component model,
such as the gate-level model introduced above. This approach works well when dealing with
implementation technologies that adhere to gate-level semantics (e.g., bipolar gate arrays). However,
MOS circuits contain bidirectional switching elements that cannot be modeled by the simple
composition of Boolean gates. Since many of the circuit techniques that make MOS attractive for LSI
and VLSI applications take advantage of this non-gate-like behavior, it is important to model such
circuits accurately.

This thesis explores the possibility of providing the essential information (functionality and
comparative timing) for large digital circuits by using models that bridge the gap between the
gate-level and detailed models discussed above. The goals to be met by these new models are summarized
in the following list:

(i) The underlying model must be computationally tractable for large circuits. The
empirical nature of the verification provided by simulation suggests that it must
be applied extensively if the results are to be useful; timely simulation
encourages this.

(ii) Transistor-level simulation is necessary to accurately model the circuit structures
found in MOS designs. This allows the designer to simulate what was designed,
which is an advantage, since requiring separate specification of a design for simulation
purposes only introduces another opportunity for error.

(iii) The results must be correct, or at least conservative; a misleading simulation that
results in unfounded confidence in a design is probably worse than no simulation
at all. Here, we must trade off the conflicting desires of accuracy and efficiency.

Two models are examined in detail by the thesis:

• a linear model in which a transistor is modeled by a resistance in series with a
voltage-controlled switch. The state of the switch is controlled by the voltage of
the transistor's gate node.

• a switch model, similar to the linear model, except that resistance values are limited
to one of two quantities: 0 for n- and p-channel devices, and 1 for depletion
devices.

MOS  circuits  are  easily  transformed  to  use  either  model,  as  illustrated  by  the  following  figure. 


(a)  original  circuit  (b)  linear  model  (c)  switch  model 

Figure  1.1.  Two  approaches  to  modeling  a  simple  MOS  circuit 

The linear model forms the basis for the RSIM simulator. In RSIM, networks of transistors and electrical
nodes form an R-C network (R for the transistors, C for the interconnect and gate capacitance); the
network's behavior under different inputs is calculated by a selective-trace (event-driven) algorithm.
The comparatively fast "pseudo circuit analysis" that is possible with the linear model allows the
designer to determine both the functional and approximate timing characteristics of a network. RSIM
goes a long way towards meeting the three goals outlined above. The algorithm employed to estimate
the behavior of a linear network is much faster than a typical circuit analysis program. Resistors are
inherently bidirectional: the network analysis makes no a priori assumptions about the direction of
current flow through each resistor. Finally, the results are at least qualitatively correct and, in general,
conservative; in some cases more conservative than designers themselves might like. With the
appropriate choice of model parameters, the results can even be quantitatively useful.

The switch model is a simplification of the linear model that is useful when only a circuit's
functionality is of interest (i.e., no information on performance is wanted). Like a traditional gate-level
simulator, a switch-level simulator bases its predictions on an abstraction of the actual circuit, but the
switch model is able to handle the bidirectional nature of MOS transistors much more successfully than
a gate-level model. The switch model is incorporated by ESIM, a simulator that has seen extensive use
in the last few years.

Certainly a major goal of RSIM and ESIM is to provide a fast, useful simulation of MOS circuits,
but the story does not end there. Another motivation for new simulation algorithms is the changing
nature of the design community. In order to cope with the increasing complexity of integrated circuit
design, new design methodologies have developed (e.g., [Mead80]) that impose constraints on the way
circuits are constructed. One can no longer afford to hand-craft each transistor, so rules of thumb are
created to aid in the choice of transistor sizes. Clever circuit configurations are avoided in favor of
circuits composed under the guidance of composition rules (e.g., [Bell81]) that rule out arbitrary circuits
and the obscure electrical behavior they imply.†

These new design methodologies have opened up the field of LSI design to a new breed of
"Mead and Conway" designer, i.e., a designer who is a sophisticated architect, but who is not a
specialist in LSI technology. An important aspect of the simulators described in this thesis is that their
underlying models are easily understood by this new breed of designer. The abstractions embodied by
the simulators are faithful enough to the actual electrical behavior of a circuit that the achievement of
a successful simulation run indicates freedom from a large class of potential failure modes. If a
simulation does point out an error, it does so in a manner that leads even the novice designer to a
good understanding of the circuit as actually designed and the ways in which it might differ from the
intended design.

However, the simulators are based on models of actual behavior. As with any model,
discrepancies are likely to exist between the model predictions and the actual behavior of a circuit.
The tools described here attempt to be conservative, i.e., to give pessimistic predictions, but this cannot
be guaranteed. Thus, it is important that the designer become acquainted with the inner workings of
the models and their shortcomings. The tools perform a calculation one could do by hand (only faster
and with greater accuracy and consistency); they should not be treated as black boxes. The models
presented here are simple enough to enable any designer to gain the necessary understanding.

†State-of-the-art designs intentionally exploit the "obscure" behavior of certain circuits (e.g., sense amplifiers), often
to considerable commercial advantage. RSIM and its relatives are not as useful for this type of design as conventional
circuit analysis programs. But the professionals engaged in such well-focused designs are not the audience
addressed by Mead and Conway (and RSIM).

A final motivation for new simulation technology is the desire to improve simulator performance.
It seems that digital computers ought to be well suited for the simulation of digital logic.
Unfortunately, current simulation schemes involve several layers of interpretation (e.g., command
interpretation, access to the network data base, model evaluation), and their performance suffers as a
result. Happily, much of this overhead can be eliminated through the application of traditional
compilation techniques. This is the theme of the final section of the thesis, and the motivation for the
development of CSIM, a combination compiler/simulator. CSIM compiles a network into a simulation
subroutine; the subroutine contains code to compute the new value of each node from its old value
and the values of other nodes in the network. The compilation is particularly easy when the node is
the output of a logic gate, and the work presented here extends the compilation technique to any node
in a MOS circuit. Simulating the network entails executing the subroutine repeatedly until no nodes
change value. If the circuit is very active, i.e., if many nodes change value each time the network is
simulated, the simulation subroutine computes new node values many times faster than the
corresponding event-driven simulation. There has been much interest recently in special purpose
hardware for simulation [Pfister82, Zycad83]. It may be that such developments are premature, and
that substantially better simulation performance can still be obtained from general-purpose computers.
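
The compile-then-iterate idea can be illustrated with a small sketch. It is not CSIM's actual output: the cross-coupled NOR latch, the node names (s, r, q, qbar), and the helper names step and settle are all assumptions chosen for illustration. The "compiled" subroutine computes each node's new value from current node values, and the driver executes it repeatedly until no node changes value.

```python
def step(nodes):
    """One pass of a hypothetical 'compiled' subroutine for a NOR latch.

    Each assignment computes a node's new value from the current values
    of other nodes (node names are illustrative, not from the thesis).
    """
    new = dict(nodes)
    new["q"] = int(not (new["r"] or new["qbar"]))
    new["qbar"] = int(not (new["s"] or new["q"]))
    return new


def settle(nodes, subroutine, max_passes=100):
    """Execute the subroutine repeatedly until no node changes value."""
    for passes in range(1, max_passes + 1):
        new = subroutine(nodes)
        if new == nodes:
            return nodes, passes
        nodes = new
    # Guard added for this sketch: a real network may oscillate.
    raise RuntimeError("network did not settle")


# Assert the set input, then release it; the latch should hold q = 1.
state, _ = settle({"s": 1, "r": 0, "q": 0, "qbar": 1}, step)
state["s"] = 0
state, _ = settle(state, step)
```

Every pass evaluates every node, which is why this style pays off mainly when many nodes change on each pass.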

The relationship among RSIM, ESIM, and CSIM is illustrated in the table below.

                 RSIM               ESIM             CSIM
node values      logic-level        logic-level      logic-level
                 (from voltages)
model level      transistor         transistor       node equations
components       resistors &        switches &       equations
                 capacitors         capacitors       (from switches)
scheduling       event-driven       event-driven     compile-time
relative speed   1                  .5 - 3           [illegible in source]


No one simulator has a speed advantage, for reasons explained in subsequent chapters. It is not
unusual to use all three simulators during the course of a design, since each brings out a different
aspect of a circuit's behavior. ESIM is often used during the early stages of a design when the designer
is fleshing out the logic. RSIM is used to determine which portions of the design are in need of a
careful performance analysis; usually the performance of most of the circuit can be debugged with the
level of detail provided by RSIM. Finally, CSIM is useful for long simulation runs intended to verify the
functionality of the design through extensive diagnostics.

This  thesis  presents  the  new  models  and  their  accompanying  simulators  in  detail,  exploring  the 
ramifications  of  each  model  and  discussing  the  accuracy  and  usefulness  of  their  predictions.  The  next 
section  gives  a  brief  outline  of  the  remaining  chapters. 


1.2.  Outline  of  the  remaining  chapters 

The thesis has three main parts. The first part focuses on the linear model and the RSIM
simulator.

Chapter 2   description of the switch/resistor transistor model incorporated by
            RSIM; outline of the method for calculating a node's value using the
            linear transistor model; propagation of changes through the network;
            choosing model parameters; analysis of sample circuits using the linear
            model.

Chapter 3   justification of the linear model by analysis of the true behavior of MOS
            logic gates; comparison of actual voltages and propagation delays
            with RSIM's predictions; proposal for modifications to the model
            based on insight gained during analysis; analysis of sample circuits
            using the updated model.

Chapter 4   details of converting the linear model into a workable simulation
            algorithm; optimizations for improving simulator performance;
            mechanisms for controlling the voltage and transition time predictions
            for specific nodes; review of the successes and failures of the linear
            model.

The second part (Chapter 5) presents the switch-level model. The chapter begins with a
discussion of the representation of node values and explains why many extant simulators adopt a
representation that leads to unnecessary difficulties. Next, two switch-level algorithms are presented.
The first is a straightforward adaptation of the RSIM algorithm, replacing its resistance computations
with simpler ones that reflect the resistance value constraints of the switch model. The second
algorithm is based on an entirely different approach; each computation handles a single transistor and
uses only local information (the type of the transistor and the states of its terminal nodes). The
computation is easy to understand and appeals to our intuition about the way transistors really
operate. The simulation proceeds by repeatedly computing new node values for the source and drain
nodes of individual transistors, choosing the transistors in any convenient order. The simulation is
complete when no further changes in the network state are possible. The termination of this
relaxation algorithm is proved, and the final network state is shown to be independent of the order in
which the individual computations are performed. The second algorithm is well suited for
implementation on the new parallel architectures just now becoming available; the approach discussed
here is a first cut at designing simulation algorithms tailored for use on parallel engines.

The  third  part  (Chapter  6)  investigates  the  possibility  of  using  various  compilation  schemes  to 
improve  the  performance  of  the  switch-level  simulator.  A  technique  is  proposed  for  constructing  a  set 
of  equations  for  each  node  in  the  network.  These  equations  relate  the  new  value  of  a  node  to  its 
current  value  and  the  values  of  other  nodes  in  the  network.  The  network  can  be  simulated  by 
evaluating  each  node's  equations  in  turn;  several  ways  of  ordering  the  nodes  for  evaluation  are 
discussed.  The  section  concludes  with  several  examples  of  simulation  routines  that  were  compiled 
directly  from  the  network  data  base.  When  executed,  these  routines  result  in  a  simulation  several 
orders  of  magnitude  faster  than  otherwise  possible. 

The  thesis  concludes  with  a  discussion  of  other  work  in  the  area  of  simulation  and  its  relationship 
to  the  ideas  presented  here. 


CHAPTER TWO


A  Linear  Network  Model  for  MOS  Simulation 


The electrical model described in this chapter can be used as the basis for a logic-level simulation
of a network of MOS transistors. Other models are of course possible, ranging in accuracy and detail
from circuit analysis to high-level functional simulation. While the chosen model does not encompass
many of the operational details of real MOS networks (most notably, detailed transistor modeling), it is
adequate to efficiently determine the basic functionality and the approximate timing characteristics of
a network. Short circuits, charge sharing, nodes with multiple drivers, bidirectional "pass" transistors,
and so on are modeled correctly.

The first section describes the switch/resistor transistor model incorporated by RSIM. Using this
model, a MOS network is simulated as a resistor network where each node's value is determined by the
resistance of its connections to various inputs. The second section outlines the method for calculating
the value of each node. This is followed by an explanation of the use of component models to predict
the propagation of new input values through a network. The fourth section discusses techniques for
choosing model parameters and compares RSIM's predictions with those of a circuit analysis program.
The chapter concludes with a summary of the model's ingredients.


2.1. RSIM's transistor model

The transistor model in RSIM can be quite simple since it is only used to predict the final logic
state of a node and the length of time each state transition takes. As an example of how the model
works, consider a simple inverter: one can think of the effective resistance of its component devices at
any moment as

$$R_{\text{eff:pullup}} = \frac{v_{\text{ds:pullup}}}{i_{\text{ds:pullup}}} \qquad R_{\text{eff:pulldown}} = \frac{v_{\text{ds:pulldown}}}{i_{\text{ds:pulldown}}} \qquad (2.1)$$

The following figure shows the actual effective resistance of an inverter's pullup and pulldown as a
function of the inverter's output voltage (assuming no load current).

(axes: vds:pullup, vds:pulldown)

Figure 2.1. Effective device resistances in an inverter

Although the effective resistances of the transistors change as their terminal voltages vary, it might be
possible to use "average channel resistances" to characterize the transistors' behavior.

The other salient feature of a transistor's operation is its switch-like behavior. With certain
voltages on a transistor's terminal nodes, it makes no connection at all between its source and drain
terminals: the transistor is "off". As the relative terminal voltages change, the transistor turns "on",
conducting current between its source and drain terminals. As illustrated in the previous figure, the
transistor is more "on" at some times than others, but the distinction among the different "on" states
can be ignored for simplicity.

There are three basic types of transistor switches found in MOS circuits:

(a) n-channel switch: ON when gate = 1, OFF when gate = 0
(b) p-channel switch: ON when gate = 0, OFF when gate = 1
(c) depletion switch: always ON

Figure 2.2. Three types of MOS transistor switches


The difference between n-channel and p-channel switches is the logic level which turns on the switch.
The depletion switch is always on; it is usually connected to VDD in a way that provides a source of
current to keep its output node charged high. More precise distinctions between the switch types, and
the need for a depletion device (and why an ordinary switch does not suffice), are discussed in Chapter
3.

One can build on the observations made above to construct a linear transistor model for RSIM:

(a) n-channel transistor
(b) RSIM model: a resistance $R_{eff}$ in series with a switch between drain and source; the switch is
    open when $v_{gate}$ = 0, closed when $v_{gate}$ = 1, and unknown when $v_{gate}$ is unknown

Figure 2.3. RSIM model for n-channel transistor


It is easy to tabulate the sort of connection that exists between the source and drain terminals as a
function of the gate voltage:

$$R_{ds} = \begin{cases} R_{eff} & \text{switch closed } (v_{gate} = 1) \\ \infty & \text{switch open } (v_{gate} = 0) \\ [R_{eff}, \infty] & \text{switch unknown } (v_{gate} = X) \end{cases} \qquad (2.2)$$

Note that uncertainty about the state of the switch leads naturally to an interval describing the
resistance of the source-drain connection. In fact, all the network calculations use interval arithmetic,
and the bounds of the resulting intervals are used when converting voltages to logic states, etc.; no
other mechanisms are needed to deal successfully with X states in the network. Models for other
types of transistors differ in the way the position of the switch is determined from $v_{gate}$:


(a) p-channel transistor model (b) depletion transistor model

Figure 2.4. RSIM models for p-channel and depletion transistors
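
The tabulation in (2.2), together with the switch rules for the three transistor types, can be sketched as follows. This is an illustrative sketch rather than RSIM's code: the function names, the string encoding of gate states, and the series/parallel helpers are assumptions, and effective resistances are assumed to be positive.

```python
import math
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) in ohms


def switch_state(kind: str, gate: str) -> str:
    """Switch position for kind in {'n', 'p', 'depletion'}, gate in {'0', '1', 'X'}."""
    if kind == "depletion":
        return "closed"  # a depletion switch is always on
    if kind not in ("n", "p"):
        raise ValueError("unknown transistor kind: " + kind)
    if gate == "X":
        return "unknown"
    on_level = "1" if kind == "n" else "0"
    return "closed" if gate == on_level else "open"


def r_ds(kind: str, gate: str, r_eff: float) -> Interval:
    """Source-drain resistance as an interval, following (2.2)."""
    state = switch_state(kind, gate)
    if state == "closed":
        return (r_eff, r_eff)
    if state == "open":
        return (math.inf, math.inf)
    return (r_eff, math.inf)  # unknown gate: anywhere from on to off


def series(a: Interval, b: Interval) -> Interval:
    """Series combination; bounds add because the sum is monotonic."""
    return (a[0] + b[0], a[1] + b[1])


def parallel(a: Interval, b: Interval) -> Interval:
    """Parallel combination; also monotonic, so endpoints map to endpoints."""
    def par(x: float, y: float) -> float:
        if math.isinf(x):
            return y
        if math.isinf(y):
            return x
        return 0.0 if x + y == 0 else x * y / (x + y)
    return (par(a[0], b[0]), par(a[1], b[1]))
```

Because both combinations are monotonic in each resistance, evaluating them at the interval endpoints gives the bounds of the result, which is what lets an X gate propagate as a widened interval without any special-case machinery.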

The effective resistance $R_{eff}$ is determined separately for each transistor and depends on

width, length   dimensions of the active transistor area. Various non-linear effects
                make $R_{eff}$ a more complicated function of the transistor geometry
                than just length/width.

type            Most MOS circuits contain more than one type of transistor. The
                different types are distinguished by different values for their
                threshold voltage. Since the current conducted by a transistor is a
                function of its threshold voltage and hence its type, the modeling
                resistance also depends on the transistor type.

context         Accuracy in choosing the effective resistance can be improved by
                distinguishing several contexts in which a transistor may appear: for
                example, an enhancement transistor can be used as a pulldown or
                source-follower in addition to its default pass gate configuration.
                Surprisingly few contexts need to be recognized to encompass a large
                portion of digital MOS designs.

The determination of $R_{eff}$ is made once for each transistor and does not depend on any dynamic
properties of the circuit to be simulated. During simulation the only device information RSIM uses
about a transistor is its effective resistance.

Actually RSIM uses not one, but three effective resistances for each transistor. To understand
why, recall that RSIM tries to predict the transition time and final voltage for a node, as shown in the
following figure.

Figure 2.5. $R_{eff}$ is used to predict transition time and final voltage

One would like to calibrate the model to give accurate predictions for both quantities, but that is
impossible with a single set of resistances. To solve this problem, RSIM uses three resistances for each
transistor:

$R_{static}$    when calculating the final voltage.

$R_{dynlow}$    when calculating the transition time for high-to-low transitions.

$R_{dynhigh}$   when calculating the transition time for low-to-high transitions.

Two "dynamic" resistances are used so that the asymmetric behavior of pass devices can be accurately
predicted. Computations involving $R_{eff}$ are triplicated, one for each of the three actual resistances, so
subsequent calculations can use the appropriate value.
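
As a rough illustration of how the three resistances might be carried and selected, consider the sketch below. The class name, field names, and the single-transistor, single-capacitor setting are assumptions made for the example; RSIM itself derives the time constant from the whole surrounding network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resistances:
    """The three effective resistances of one transistor, in ohms."""
    static: float   # used when calculating the final voltage
    dynlow: float   # used for high-to-low transition times
    dynhigh: float  # used for low-to-high transition times


def transition_time(r: Resistances, capacitance: float, final_state: int) -> float:
    """RC time constant for one transistor driving one lumped capacitance.

    final_state is 1 for a low-to-high transition and 0 for high-to-low.
    """
    r_dyn = r.dynhigh if final_state == 1 else r.dynlow
    return r_dyn * capacitance


# Hypothetical values: a pass device that pulls low faster than it pulls high.
pass_device = Resistances(static=30e3, dynlow=10e3, dynhigh=20e3)
t_fall = transition_time(pass_device, 0.1e-12, final_state=0)
t_rise = transition_time(pass_device, 0.1e-12, final_state=1)
```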


2.2.  RSIM's  node  model 

Voltages in this model are quantized into one of three values; this corresponds to our intuition
for digital logic and greatly simplifies the simulation calculations. If all node voltages are normalized
to fall in the range [0, 1], then the possible quantized values are

0    logic low: voltages in the range $[0, v_{low}]$;

1    logic high: voltages in the range $[v_{high}, 1]$;

X    intermediate voltages, $(v_{low}, v_{high})$, or unknown voltages, [0, 1]; to be
     conservative X is always interpreted as representing the larger interval;

where $v_{low}$ and $v_{high}$ are the predetermined logic thresholds.
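
The quantization rule can be written down directly. In the sketch below the threshold values 0.3 and 0.8 are hypothetical placeholders, and the function name is an assumption; the input is a normalized voltage interval, since the network calculations produce intervals rather than single values.

```python
V_LOW = 0.3   # hypothetical logic-low threshold (normalized)
V_HIGH = 0.8  # hypothetical logic-high threshold (normalized)


def quantize(v_min: float, v_max: float,
             v_low: float = V_LOW, v_high: float = V_HIGH) -> str:
    """Map a normalized voltage interval [v_min, v_max] to '0', '1', or 'X'."""
    if not 0.0 <= v_min <= v_max <= 1.0:
        raise ValueError("expected 0 <= v_min <= v_max <= 1")
    if v_max <= v_low:
        return "0"  # the whole interval lies in [0, v_low]
    if v_min >= v_high:
        return "1"  # the whole interval lies in [v_high, 1]
    return "X"      # intermediate, or straddling a threshold
```

An interval that straddles either threshold is reported as X, which matches the conservative reading of X as the larger interval [0, 1].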

How is the value of a node determined? Using the transistor model described in the previous
section, the original network is transformed into a network of resistors (formerly transistors) and
capacitors (formerly nodes). If a node is not connected to any input, it is said to be charged with a
logic state determined by the state of the last driven node it was connected to. If two or more charged
nodes in different logic states are connected then charge sharing occurs. In this case, all the connected
nodes reach the same logic state; this state is determined by the relative capacitances and initial logic
states of the nodes in the stage. For example, if a large (high capacitance) node such as a data bus
were connected by a pass transistor to a small node such as the input to a register cell, then the small
node would "share" the charge of the large node as its final value regardless of the charge it had
initially. Even nodes that ultimately have a connection to an input participate in charge sharing; the
extent of their participation is governed by the relative sizes of the charge-sharing time constant and
the time constant associated with the input connection.


Electrically connected nodes form natural groupings, called stages, bordered by input nodes
(usually VDD and GND). If nodes in a stage are allowed to share charge, all will reach the same
voltage, V_share, given by

$$V_{share:min} = \frac{\sum_{\text{1 nodes}} C_i}{\sum_{\text{all nodes}} C_i}
\qquad
V_{share:max} = \frac{\sum_{\text{1 nodes}} C_i + \sum_{\text{X nodes}} C_i}{\sum_{\text{all nodes}} C_i}
\qquad (2.3)$$


where the sums are over nodes in the current stage. Since nodes at logic state X contribute an
undetermined amount of charge to the result, V_share is an interval whose bounds represent the worst
case assumptions about the actual values of X nodes. These bounds are compared with the logic
thresholds when calculating the charge-sharing value:


$$\text{Charge-sharing value} = \begin{cases}
0 & V_{share:max} \le v_{low} \\
1 & V_{share:min} \ge v_{high} \\
X & \text{otherwise}
\end{cases} \qquad (2.4)$$
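As a rough sketch of equations 2.3 and 2.4, the interval and the resulting logic state might be computed as follows. The representation of a stage as a list of (state, capacitance) pairs, the function names, and the threshold values are assumptions made for this example.

```python
def share_interval(nodes):
    """Return (V_share:min, V_share:max) for a stage, following equation 2.3.

    nodes is a list of (state, capacitance) pairs, state being '0', '1' or 'X'.
    """
    total = sum(c for _, c in nodes)
    ones = sum(c for s, c in nodes if s == "1")
    unknown = sum(c for s, c in nodes if s == "X")
    # X nodes may hold anywhere from no charge to full charge.
    return ones / total, (ones + unknown) / total


def charge_sharing_value(nodes, v_low=0.3, v_high=0.8):
    """Compare the interval bounds with the thresholds, following equation 2.4."""
    v_min, v_max = share_interval(nodes)
    if v_max <= v_low:
        return "0"
    if v_min >= v_high:
        return "1"
    return "X"


# A large bus (1.0 pF, high) connected to a small register input (0.05 pF, low):
bus_and_cell = [("1", 1.0), ("0", 0.05)]
print(charge_sharing_value(bus_and_cell))
```

With these hypothetical capacitances the small node takes on the value of the bus, as in the data bus example given earlier.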


This calculation is not strictly accurate when the stage contains transistors with gates of X. Such
transistors might not make any connection at all, invalidating the various sums in equation 2.3. An
alternative charge-sharing calculation that addresses this problem is discussed in Section 4.1.1.

When one accounts for the resistance between nodes, it is difficult to calculate transition times
for any nodes that change value because of charge sharing. RSIM simply schedules any charge-sharing
transitions so they happen immediately. A more reasonable time constant might be (Σ R_i) C_eff, where
the first term is the sum of all the resistances in the stage and

$$C_{eff} = \begin{cases}
\sum_{\text{0 and X nodes}} C_i & \text{charge-sharing value} = 1 \\
\sum_{\text{1 and X nodes}} C_i & \text{charge-sharing value} = 0 \\
0 & \text{otherwise}
\end{cases} \qquad (2.5)$$


is  the  amount  of  capacitance  in  the  stage  that  needs  to  be  charged/discharged  to  reach  the  charge- 
sharing  value.  This  time  constant  is  surely  an  upper  bound  on  the  time  of  any  transition  in  the  stage. 
Note  that  transitions  to  X  still  happen  immediately,  a  conservative  assumption. 
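The suggested upper bound can be illustrated with a short sketch. The function name and the data layout are assumptions; resistances are taken in kilohms and capacitances in picofarads, so the product comes out in nanoseconds.

```python
def sharing_time_constant(resistances, nodes, sharing_value):
    """Upper-bound time constant for a charge-sharing transition.

    resistances: resistances of the transistors in the stage (kilohms).
    nodes: list of (state, capacitance in pF) pairs for the stage.
    sharing_value: result of the charge-sharing calculation ('0', '1' or 'X').
    """
    if sharing_value == "1":
        # Everything not already high has to be charged.
        c_eff = sum(c for s, c in nodes if s in ("0", "X"))
    elif sharing_value == "0":
        # Everything not already low has to be discharged.
        c_eff = sum(c for s, c in nodes if s in ("1", "X"))
    else:
        # Transitions to X stay immediate.
        c_eff = 0.0
    return sum(resistances) * c_eff


# One 10 kilohm pass transistor between a 1.0 pF bus (high) and a 0.05 pF cell (low).
print(sharing_time_constant([10.0], [("1", 1.0), ("0", 0.05)], "1"))
```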

If a stage is connected to one or more inputs, the inputs determine the final voltage of each node
in the stage. The effect of inputs on a particular node is characterized by the Thevenin equivalent for
the stage (including the inputs at the boundary), regarding the given node as the output:


Figure  2.6.  Equivalent  circuit  for  a  network  node 

V_thev    a voltage interval [V-, V+] in the range [0, 1] specifying the possible voltages
          the output node may have. This value is calculated using each transistor's
          R_static resistance.

R_drive   a resistance interval [R-, R+] in the range [0, ∞]. Three versions of this
          value are calculated: R_drive:low, using R_dynlow for each transistor; R_drive:high,
          using R_dynhigh; and R_drive:x (see section 4.1.2). The appropriate version is
          chosen depending on the final voltage predicted by V_thev.

V_thev and R_drive are generally intervals, since the effective transistor resistances from which they are
derived might themselves lie in an interval. Chapter 4 describes how V_thev, C_load, and R_drive are
estimated for nodes in actual networks.

It is sometimes useful to categorize a node according to its equivalent R_drive, i.e., how it affects
neighboring nodes to which it becomes connected by conducting transistors:

input (R_drive = 0). Node is a designated input node (e.g., VDD or GND). The value of
input nodes can only be changed by explicit simulator commands; the assumption is
that inputs supply enough current to be unaffected by connections (possibly shorts to
other inputs) made by transistors.

driven (R_drive < ∞). Node is part of a voltage divider between two inputs, i.e., it is
connected by transistors to other driven or input nodes. Driven nodes can affect the
value of charged nodes without being affected themselves, but may be forced to an X
state if shorted to a driven or input node that has a different logic level.

charged (R_drive = ∞). Node is connected, if at all, only to other charged nodes.
Until reconnected to some other part of the network, charged nodes maintain their
current logic state indefinitely (charge storage with no decay).

If R_drive is infinite, equation 2.4 predicts the correct final value for the node and no further work is
needed. If R_drive < ∞, and the node is not an input, the final state of a driven node is calculated from
the V_thev interval [V-, V+]:


$$\text{Final value} = \begin{cases}
0 & V_+ \le v_{low} \\
1 & V_- \ge v_{high} \\
X & \text{otherwise}
\end{cases} \qquad (2.6)$$


As  an  example,  consider  several  different  states  of  a  NOR  gate: 


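Figure 2.7. Equivalent circuits for a NOR gate with different inputs

$$V_{thev} = \begin{cases}
1 & \text{figure 2.7(b)} \\
\dfrac{R_2}{R_1 + R_2} & \text{figure 2.7(c)} \\
\left[\dfrac{R_2 \parallel R_3}{R_1 + (R_2 \parallel R_3)},\ \dfrac{R_2}{R_1 + R_2}\right] & \text{figure 2.7(d)}
\end{cases} \qquad (2.7)$$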
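To make equations 2.6 and 2.7 concrete, the sketch below evaluates the three NOR gate cases numerically. The resistance values, the thresholds, and the reading of the figure (R_1 as the pullup, R_2 and R_3 as the pulldowns, with case (d) having one pulldown on and the other controlled by an X gate) are assumptions made for illustration.

```python
def parallel(r_a, r_b):
    """Resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def divider(r_up, r_down):
    """Normalized output of a divider: VDD (1) through r_up, GND (0) through r_down."""
    return r_down / (r_up + r_down)


def final_value(v_minus, v_plus, v_low=0.3, v_high=0.8):
    """Logic state for a Thevenin voltage interval [V-, V+], following equation 2.6."""
    if v_plus <= v_low:
        return "0"
    if v_minus >= v_high:
        return "1"
    return "X"


# Hypothetical static resistances in kilohms: pullup R1, pulldowns R2 and R3.
R1, R2, R3 = 40.0, 10.0, 10.0

case_b = (1.0, 1.0)                                         # no pulldown conducting
case_c = (divider(R1, R2), divider(R1, R2))                 # one pulldown conducting
case_d = (divider(R1, parallel(R2, R3)), divider(R1, R2))   # second pulldown is X

for name, (v_minus, v_plus) in [("b", case_b), ("c", case_c), ("d", case_d)]:
    print(name, round(v_minus, 3), round(v_plus, 3), final_value(v_minus, v_plus))
```

With a 4:1 ratio both bounds of case (d) fall below the low threshold, so the unknown input does not make the output unknown.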


If the final value of a node differs from its charge-sharing value, then the appropriate event is
scheduled R_eff C_eff seconds in the future, where

$$R_{eff} = \begin{cases}
R_{drive:high} & \text{final value} = 1 \\
R_{drive:low} & \text{final value} = 0 \\
R_{drive:x} & \text{final value} = X
\end{cases} \qquad (2.8)$$
$$C_{eff} = \begin{cases}
\ldots & \text{final value} = 1 \\
\ldots & \text{final value} = 0 \\
\ldots & \text{final value} = X
\end{cases} \qquad (2.9)$$

where the sums are computed for nodes in the current stage. Note that transitions to X are not
immediate, but have a time constant related to the fastest transition the node can make. This means
that a momentary short-circuit, such as that shown in the following figure, does not necessarily cause a
node to become X; what happens depends on the relative sizes of the various time constants.
Figure 2.8. A momentary short-circuit does not necessarily cause an X value

If the delay through the inverter is small compared to the time constant of the output node, no X
transition will be processed for the output node (one is scheduled, but is aborted when the pullup
turns off).
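A minimal sketch of how the delay of a final-value event might be chosen, and of why a short enough short-circuit never produces an X. It simplifies R_drive to single numbers rather than intervals, takes C_eff as a given input, and uses hypothetical names and values throughout.

```python
def transition_delay(final_value, r_drive, c_eff):
    """Delay R_eff * C_eff, with R_eff selected by the final value as in equation 2.8.

    r_drive maps 'high', 'low' and 'x' to drive resistances (kilohms);
    c_eff is in pF, so the delay is in ns.
    """
    key = {"1": "high", "0": "low", "X": "x"}[final_value]
    return r_drive[key] * c_eff


def x_transition_fires(x_delay, short_duration):
    """True if a scheduled X transition is reached before the short-circuit ends."""
    return x_delay < short_duration


# Hypothetical drive resistances for an output node with 0.1 pF of capacitance.
r_drive = {"high": 40.0, "low": 10.0, "x": 8.0}
x_delay = transition_delay("X", r_drive, 0.1)

# The short lasts only as long as the inverter delay, here assumed to be 0.3 ns.
print(x_delay, x_transition_fires(x_delay, short_duration=0.3))
```

Here the X event would be scheduled 0.8 ns out, but the short ends after 0.3 ns, so the event is aborted before it is processed.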

To  better  understand  the  interaction  between  the  charge-sharing  and  final-value  calculations, 
consider  the  following  example: 



Figure  2.9.  Sample  circuit  for  charge-sharing  and  final-value  calculation 

Assuming that C_B is initially charged low and that charge sharing happens immediately (an
assumption RSIM makes), there are several different scenarios:

C_A << C_B   node A goes low immediately because of charge sharing with B. Then,
             both nodes are driven high by the pullup: node A at time
             R_1(C_A + C_B), and node B at time (R_1 + R_2)(C_A + C_B).

C_A >> C_B   node B goes high immediately because of charge sharing with A; the
             pullup has nothing to contribute.

C_A ≈ C_B    both A and B go to X immediately and are then pulled up with the same
             time constants as for C_A << C_B.

If R_2 is reasonably smaller than R_1, then the assumption that charge sharing happens quickly is valid,
and these scenarios are satisfactory. As R_2 approaches R_1 in value, the time constants associated with
charge sharing approach those of the pullup, and the assumption of immediate charge sharing is a
relatively poor one.† Augmenting the charge sharing calculation as described in equation 2.5 would
improve the prediction in this case.
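The three scenarios can be reproduced with a small sketch. It assumes node A starts at logic 1 (it sits next to the pullup) and node B at logic 0; the thresholds and component values are hypothetical, with resistances in kilohms and capacitances in picofarads.

```python
def scenario(c_a, c_b, r1, r2, v_low=0.3, v_high=0.8):
    """Return (charge-sharing value, time A goes high, time B goes high) in ns.

    A is assumed initially high and B initially low; charge sharing is
    immediate, and the pullup then drives both nodes through R1 and R1 + R2.
    """
    v_share = c_a / (c_a + c_b)  # only A contributes charge
    if v_share <= v_low:
        shared = "0"
    elif v_share >= v_high:
        shared = "1"
    else:
        shared = "X"
    if shared == "1":
        return shared, None, None  # pullup has nothing to contribute
    total_c = c_a + c_b
    return shared, r1 * total_c, (r1 + r2) * total_c


print(scenario(0.05, 1.0, r1=20.0, r2=5.0))   # C_A much smaller than C_B
print(scenario(1.0, 0.05, r1=20.0, r2=5.0))   # C_A much larger than C_B
print(scenario(0.5, 0.5, r1=20.0, r2=5.0))    # comparable capacitances
```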

In summary, calculating a node's value involves two separate computations, each of which can
generate a new event:

(1) a charge-sharing event describing an immediate change in the node's state caused
    by the redistribution of charge among the capacitors for nodes in the current
    stage. This type of event is generated when two stages are merged (i.e., a
    transistor turned on).

(2) a final-value event describing what the final, driven state of the node will be.
    This type of event is generated when R_drive < ∞.

Chapter 4 describes the way these two events are reconciled with each other and with pending events
to produce a final set of transitions for a node.


2.3.  RSIM's network model

The networks‡ simulated by RSIM are made up of two basic components:

(i) electrical nodes which serve as wires. Each node has a capacitance that is the
    sum of two contributions: (1) capacitance between other layers and the
    conducting layers that make up the node; and (2) capacitance from the gate
    junctions formed by the node.

(ii) three-terminal transistors (MOSFETs) which act as switches. Each transistor
     conditionally connects two nodes (called the source and drain of the transistor)
     depending on the voltage of the third node (called the gate of the transistor).

Some nodes (e.g., VDD and GND) are designated as inputs that supply the current needed to change the
voltage of a node by charging/discharging the node's capacitor. As the voltage of a node changes,
switches controlled by the node open or close, making connections that cause the voltages of other
nodes to change. It is RSIM's job to predict the dynamic behavior of a network of nodes and switches,
estimating the voltage of each node, the state of each switch, and the charge/discharge rate when a
node changes value. From the designer's point of view, this translates into knowledge about the logic
level of each node and the transition time associated with each change of logic level.

†This illustrates the asymmetry between the timing of transitions due to charge sharing and those due to the final
value calculation, i.e., R_2 affects only the final value transition. This anomaly could be exploited to produce rather
bizarre predictions, e.g., a node changes faster if it is connected to a capacitor than if it is connected to an input! As
a practical matter, circuit performance seldom depends on the timing of charge-sharing transitions, and these
anomalies are not significant.

‡Networks can be entered as schematics [Terman81] or extracted from layout information [Baker80]. The latter
approach provides fairly accurate estimates of the capacitance of each node.

It is easy to build switch configurations that compute simple logic functions of node values. For
example:


(a) constant 1     (b) nMOS inverter     (c) CMOS inverter

Figure 2.10. Examples of switch configurations that perform logic operations

The output node in figure 2.10(a) is connected to a depletion switch configured as a current source; its
value is always a logic high. Such circuits are called pullups because their output nodes are always
"pulled-up" to logic high. In figure 2.10(b) a "pulldown" switch has been added, controlled by node
A. The pulldown is sized so that, when it is on, it conducts more current than the pullup supplies.
When A is 1, the output node is "pulled-down" to 0. Of course, when A is 0, the pulldown is off and
the pullup ensures that the output is 1; the net result is an inverter circuit. Figure 2.10(c) is an
inverter constructed from one p-channel and one n-channel device. Typically, the manufacturing
process can provide either p-channel devices or depletion devices, but not both, in the same circuit.
More complicated logic circuits are constructed using series and parallel switch configurations.
(a) connection if (A or B)     (b) connection if (A and B)

Figure 2.11. Logic functions associated with series and parallel configurations

If the two-switch circuits shown above replace the pulldown in figure 2.2(b), the result is a two-input
NOR or NAND gate.

In all the circuits presented so far, the inputs are electrically isolated from the outputs, i.e., if the
output signal is corrupted somehow (by a short circuit, for example) the input signals are
unaffected. The isolation provided by the gate connection leads to a natural decomposition of the
network into stages made up of nodes and transistors. Nodes belong to different stages only if they
are guaranteed to be electrically isolated. For example, in the following circuit, nodes A, B, C, and D
are all isolated from one another. Node E is not isolated from D, so it is in the same stage as D.


Figure 2.12. Simple circuit that has three stages


Note that VDD and GND (and, in fact, any input) are not treated as nodes in the ordinary sense when
checking to see if two nodes belong to the same stage. For example, node B is not considered to
connect to node C by a path involving GND and two of the pulldown transistors. Given a particular
node, a tree walk of the network is performed to find all other nodes in the stage. The tree walk first
locates all "on" switches which have a source/drain connection to the original node. Nodes connected
to the drain/source of those switches are part of the same stage as the original node. The tree walk
continues until it locates all nodes that can be reached from the original node by a path of "on"
switches; this set of connected nodes and the "on" transistors that form the connections make up a
single stage. Note that the decomposition of the network into stages is a dynamic process, i.e., one
that depends on the node values of the network.† For example, the following circuit can be
decomposed into 2, 3 or 4 stages depending on the value of nodes A and B.


Figure 2.13. Circuit with multiple decompositions (inputs A and B; output F = A xor B)

Node F is always in a separate stage. If A = 0 and B = 0, then C, D, and E all form a single stage; if
A = 1 and B = 0, then D is isolated from C and E; and so on.
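The tree walk described above amounts to a graph traversal over conducting switches that stops at input nodes. The sketch below assumes n-channel switches that conduct when their gate is 1, and uses a hypothetical four-transistor network chosen so that the decompositions just quoted come out the same; it does not claim to reproduce the topology of figure 2.13.

```python
def find_stage(start, transistors, values, inputs):
    """Return the set of nodes reachable from start through "on" switches.

    transistors: list of (gate, source, drain) node names.
    values: current logic state of every gate node.
    inputs: nodes such as VDD and GND that bound a stage and are not traversed.
    """
    stage = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for gate, source, drain in transistors:
            if values[gate] != "1":
                continue  # switch is not conducting
            if node == source:
                other = drain
            elif node == drain:
                other = source
            else:
                continue
            if other in inputs or other in stage:
                continue  # inputs are stage boundaries
            stage.add(other)
            frontier.append(other)
    return stage


# Hypothetical network: two pulldowns to GND and two pass transistors around E.
transistors = [("A", "C", "GND"), ("B", "D", "GND"), ("D", "C", "E"), ("C", "E", "D")]
inputs = {"VDD", "GND", "A", "B"}

both_low = {"A": "0", "B": "0", "C": "1", "D": "1", "E": "1"}
a_high = {"A": "1", "B": "0", "C": "0", "D": "1", "E": "0"}
print(sorted(find_stage("C", transistors, both_low, inputs)))
print(sorted(find_stage("D", transistors, a_high, inputs)))
```

Because GND is in `inputs`, two nodes that are both pulled down are not merged into one stage merely because they share a path through ground.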

When RSIM simulates a network, it does its analysis stage by stage. Since the values of nodes in
a stage are closely related (the nodes are shorted together), it makes sense to calculate all the values at
the same time. By the same reasoning, all the transistors and nodes that influence the value of a
particular node are in the same stage as that node. Stages are the analogs of gates in a gate-level
simulator. In a gate network, each node's value is determined by a single gate, and the output of a
gate is electrically isolated from the inputs; the gate is the ideal unit of analysis. In MOS networks with
bidirectional devices, the traditional gate model is not adequate; hence the motivation for stages.

†This differs from the notion of "transistor group" introduced by [Bryant81]. A transistor group contains all nodes
that might become connected, i.e., a stage with all switches considered to be conducting. Transistor groups can be
quite large (for example, in circuits with barrel shifters that potentially short together all bits in a data path)
whereas stages are usually quite small.
A simulation step starts when the designer changes the value of an input. (Any
node given a value by the designer is treated as an input node by the simulator.) The value of the
input influences other pieces of the network in two ways:


Figure 2.14. Two ways in which an input affects a network

The simulator first recalculates the values of nodes in stages connected to the input by the
source/drain connections of conducting switches (figure 2.14(a)). Then, for each switch controlled by
the input, stages on each side of the switch are analyzed (figure 2.14(b)). If the switch becomes
conducting because of the new input value, the pieces of the network on either side form one large
stage. If the switch just turned off, it partitions what was previously one large stage into two smaller
stages.

If a node changes value as a result of analyzing a stage, RSIM calculates the transition time by
estimating the length of time required to charge/discharge the node's capacitance. The name of the
node, its new value, and the estimated time when the transition to the new value occurs are all
remembered as an event. The simulator maintains a list of pending events, keeping the list sorted by
time, with the earliest event first.

When processing new input values causes a node to change value, a new event is generated and
saved on the event list. After all inputs have been processed, the simulator processes events, starting
with the first element of the event list. For each event, the specified node is assigned its new value.
Then, any stages affected by this change (as shown in figure 2.14(b)) are analyzed, possibly generating
new events, which are then added to the event list. The simulator continues processing events until
the event list is empty. The network is said to have "settled" at this point, and the new input values
have been completely propagated through the network.
Note that if no nodes change value when a stage is analyzed, no new events are generated.
Portions of the network that remain quiescent are not analyzed, since the simulator only analyzes
stages affected by inputs or by nodes on the event list. By limiting simulation effort to the changing
portions of the network, the event list mechanism enables the simulator to handle large circuits. The
amount of computation required for a simulation step is proportional to the amount of circuit activity,
not the size of the circuit.
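The event list mechanism can be sketched with a priority queue. In this illustration the stage analysis is replaced by a stand-in `analyze` function for a chain of inverters with fixed delays; the function names, the chain, and the delays are all assumptions, and the real analysis is the stage calculation described in the text.

```python
import heapq


def simulate(values, events, analyze):
    """Process (time, node, new_value) events in time order until none remain.

    analyze(node, values, now) returns the new events caused by node's change.
    Returns the list of transitions that actually took place.
    """
    heapq.heapify(events)  # earliest event first
    history = []
    while events:
        now, node, new_value = heapq.heappop(events)
        if values[node] == new_value:
            continue  # no change, so nothing downstream is analyzed
        values[node] = new_value
        history.append((now, node, new_value))
        for event in analyze(node, values, now):
            heapq.heappush(events, event)
    return history  # the network has settled


# Stand-in for stage analysis: A drives inverter B, which drives inverter C.
FANOUT = {"A": "B", "B": "C"}
DELAY = {"B": 2.0, "C": 3.0}


def analyze(node, values, now):
    out = FANOUT.get(node)
    if out is None:
        return []
    target = "0" if values[node] == "1" else "1"
    if values[out] == target:
        return []
    return [(now + DELAY[out], out, target)]


values = {"A": "0", "B": "1", "C": "0"}
history = simulate(values, [(0.0, "A", "1")], analyze)
print(history)
```

Only nodes that appear in an event are ever touched, which is why the work done per step tracks circuit activity rather than circuit size.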

To get a better feeling for the way a change propagates through a network, consider the
following simulation of the XOR circuit presented in figure 2.13. Nodes A and B are inputs; values for
the other nodes are determined by the simulator.
Figure 2.15. Waveforms for simulation example

Event #1.  Node A is set to 1 by the user. The simulator recalculates all stages
           affected by A, in this case, the stage containing nodes C, D, and E
           (which form one stage because C and D are 1).

All three nodes are pulled down by the switch controlled by A, so events #2, #3, and #4 are
scheduled to set C, D, and E to 0. Note that the simulator calculates a different transition time for
each node. C changes most quickly since it is connected directly to the pulldown. D is the slowest
since it discharges through the two pass devices connecting it to the pulldown.

Event #2.  C changes from 1 to 0, causing the stages containing D and E to be
           analyzed.

At the time event #2 is processed, nodes D and E are still 1, although they both have events pending
for transitions to 0. When node C goes low, it partitions what was once one large stage into two stages:
one containing only D, the other containing both C and E. Analysis of the stage containing D
shows that D is no longer pulled down, invalidating the upcoming transition. The simulator has
several choices:

(1) Notice that D is currently 1, so just remove the pending event for D. This results
    in D never changing value. This is not a bad prediction if D is scheduled to
    change substantially after C.

(2) Schedule another event (#5) for node D, which changes its value back to 1; set
    the event time so that #5 happens after #4. This choice is best if C and D are
    both scheduled to become 0 in close succession.

(3) Remove D's pending event as in (1), but report a glitch (an aborted transition) to
    the user [Thompson74]; a sort of compromise between (1) and (2). Some
    simulators only report glitches if the aborted event has been pending "long
    enough" [Nahm80].

(4) Schedule another event as in (2) that changes D's value back to 1; also change
    the pending event to be a transition to X, or, alternatively, remove the pending
    event and schedule an immediate transition to X.

As one can see, scheduling a new event is a thorny issue when it involves a node that already has
events pending. Since D's value does not really matter (it does not control any switches itself), the
first alternative seems the most reasonable. Given the simplicity of the RSIM model, it probably does
not pay to overly complicate the scheduling of events. The transition-time estimates are not accurate
enough to allow subtle distinctions to be made based on the relative transition times of nodes; RSIM
avoids choices (2), (3), and (4) since they involve such distinctions. Note that a similar problem arises
for node E. It has an event pending for a transition to the correct value (E is still going low), but the
event could be rescheduled to reflect a faster transition time since the pullup on node D no longer
impedes the transition. Chapter 4 details the exact choices made by RSIM under various circumstances.
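Choice (1) can be sketched as a small update rule on a table of pending events. This is an illustration only: it assumes at most one pending event per node, the function and variable names are hypothetical, and the rescheduling branch is one possible reading of the remark about node E rather than a description of what RSIM actually does (that is deferred to Chapter 4).

```python
def update_pending(pending, node, current_value, predicted_value, when):
    """Reconcile a fresh prediction for node with its pending event.

    pending maps a node name to its single pending (time, value) event.
    If the node is now predicted to keep its current value, any pending
    transition is simply dropped (choice 1); otherwise the pending event
    is replaced by the new prediction.
    """
    if predicted_value == current_value:
        pending.pop(node, None)  # invalidated transition, silently removed
    else:
        pending[node] = (when, predicted_value)
    return pending


# After event #2: D and E are still 1 and both have transitions to 0 pending.
pending = {"D": (4.0, "0"), "E": (3.0, "0")}

# D is no longer pulled down, so it is predicted to stay at 1.
update_pending(pending, "D", current_value="1", predicted_value="1", when=None)

# E is still going low; a faster estimate replaces the old event time.
update_pending(pending, "E", current_value="1", predicted_value="0", when=2.5)
print(pending)
```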

Returning  to  the  example: 

Event #3.    Node E is changed to 0, causing the stage containing node F to be
             analyzed. F is calculated to change value, so event #6 is scheduled.

Events #4,5. Discussed in the preceding paragraph.

Event #6.    F is set to 1. F does not affect any other stages, so no events are added
             to the event list.

At this point, the event list is empty, and the network has settled. If the user now changes node B to
1, a somewhat simpler sequence of events ensues:

Event #7.    Node B is set to 1 by the user, causing the simulator to analyze the
             stage containing D. D is predicted to go low, resulting in the scheduling
             of event #8.

Event #8.    D is set to 0, separating C and E into different stages which are then
             analyzed. C shows no change, but E is scheduled to go high (event #9)
             now that it is disconnected from C's pulldown.

Event #9.    E changes to 1, and as a consequence F is predicted to change to 0
             (event #10). Note that the low-to-high transition time can be very
             different than the high-to-low transition time; RSIM takes into account
             the relative sizes of the pullup and pulldown.

Event #10.   Finally, F is set to 0.

Once again the event list is empty, and the network has settled.


2.4.  Calibrating  and  using  the  RSIM  model 

From a practical viewpoint, the success of RSIM depends to a large degree on the choice of the
modeling resistance for each transistor. The principal goal of the calibration process is to choose
resistances that lead to accurate predictions. Actually, there are two separate sets of resistances to be
chosen: static and dynamic. Static resistances, used to estimate node voltages, are comparatively easy
to choose. When a circuit does not depend on device ratios for correct operation (e.g., a pulled-up
node or a CMOS gate), the values chosen for static resistances do not affect the voltage computation,
since the nodes connect to only one polarity of input. When a circuit makes a connection to inputs of
different polarities (e.g., an nMOS gate with a logic-low output), the intervening nodes become part
of a voltage divider, and the transistor resistances must be chosen to predict the divider's output
voltage. Since only the ratio of the pullup and pulldown devices is constrained, there is considerable
freedom in choosing the actual resistance values. Of course, inauspiciously chosen values can run
afoul of range and round-off problems in the computation, but such problems are easily avoided.

A more interesting problem is the choice of appropriate dynamic resistance values. One
approach involves performing a series of experiments designed to measure the resistance of each type
of transistor in various circuit contexts:
(a) pullup     (b) depletion source-follower     (c) n-channel source-follower

(d) n-channel pulldown     (e) n-channel pulldown w/ threshold drop

Figure 2.16. Simple experiments for measuring channel resistances

Ideally, the experiments should be performed using actual circuits; when this is impractical, a
well-calibrated circuit analysis program can be used to gather the needed measurements. Each of the
experiments entails measuring the length of time required for the output to rise or fall from its starting
voltage to the switching threshold. (Section 3.4.1 describes the reason for using a single threshold, and
the method for choosing it.) If the load capacitance is known, an appropriate channel resistance can be
calculated, essentially inverting the computation performed by RSIM. Appendix 2 presents the
transistor resistances derived in this manner for a typical 5µ nMOS process.
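Inverting the computation can be sketched as follows, assuming the transition time is modeled as the plain product of resistance and capacitance. The function names and the sample measurement are hypothetical; with resistance in kilohms and capacitance in picofarads the time is in nanoseconds.

```python
def channel_resistance(delay_ns, load_pf):
    """Resistance (kilohms) for which R * C reproduces the measured delay."""
    if load_pf <= 0:
        raise ValueError("load capacitance must be positive")
    return delay_ns / load_pf


def predicted_delay(resistance_kohm, load_pf):
    """The forward computation: transition time as R * C (ns)."""
    return resistance_kohm * load_pf


# A hypothetical experiment: the output takes 0.8 ns to reach the switching
# threshold while driving a known 0.05 pF load.
r_pulldown = channel_resistance(0.8, 0.05)
print(r_pulldown, predicted_delay(r_pulldown, 0.2))
```

The calibrated value can then be reused with a different load, which is the point of reducing each experiment to a single resistance.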

Unfortunately, while the experiments outlined above lead to usable predictions of circuit
performance, the predictions are not as accurate as one might like. The problem with the experiments
is that the resistance measurements are made in a rather artificial context. Factors important in
determining the behavior of a transistor in a particular circuit (e.g., shape of the input waveform,
Miller capacitances, etc.) are not measured by the proposed experiments. Since the simple RSIM model
does not account for these factors, they are missing completely from the calculations, leading to
inaccurate predictions. There are two alternatives:

(1) Modify the RSIM model to include effects deemed important when making
    performance predictions. It is difficult to start down this road and still keep the
    model simple; carried to its logical conclusion, this course of action leads to a
    circuit analysis program, the very thing RSIM tries to avoid. There are,
    however, alternatives that fall short of abandoning the simple model; these are
    discussed at the end of Chapter 3.

(2) Conduct more sophisticated experiments using circuit configurations found in
    actual designs.

An example of the second approach is the following experiment:


Figure 2.17. Deriving resistances by measuring inverter pair delay

The delay through a pair of inverters involves both a rising transition (measuring the pullup resistance)
and a falling transition (measuring the pulldown resistance). The initial inverter provides an
appropriately shaped input waveform; the last inverter provides a realistic output load. The measured
pair delay is arbitrarily split into a rising delay and a falling delay (say, 3/4 and 1/4 respectively), so that
the pullup and pulldown resistances can be calculated. This leads to good predictions for the chains of
inverting logic so common in MOS designs. Similar experiments can be designed to measure other
resistances. The danger in this approach is that, because of the ad hoc nature of the experiments, the
resistances might be inappropriate for new circuit configurations. However, with a prudent choice of
circuits during calibration and design, this danger can be minimized.

The following examples are analyzed using the simple calibration given in Appendix 2. The
results give a feel for the performance of the "pure" resistance model, and also set the stage for the
model improvements suggested in Chapter 3. The calculation of node voltages is straightforward and
is not mentioned in the discussion below, which focuses on the calculation of transition times. The
first example is a path through a PLA:
Figure 2.18. Sample circuit showing path through PLA
(labels, left to right: clock signal, input buffer, poly line, AND plane, OR plane)

Transistor sizes are given in microns as width/length. When the clock signal goes high, the input
signal (buffered by the inverter on the left) propagates through the input buffer and the two PLA
planes. The following figure shows the equivalent resistor/capacitor network; resistances are given in
kΩ and capacitances in pF.


Figure 2.19. Equivalent RC network for PLA circuit (shows dynamic resistances)


Note that the pullup for node C is recognized as a depletion source-follower without considering the
actual voltage on its gate. Since depletion devices are always on, the inverter which leads from node B
to the gate of the pullup is ignored by RSIM, and the timing for node C is always controlled by node B.
Also note that the resistance chosen for the pulldown for node B reflects the threshold drop of node
A.

When calculating $R_{dynlow}$, RSIM simply calculates the net resistance to ground, ignoring the
effects of any pullups. For example, a falling transition for node B takes (16)(.05) = 0.8 ns. This
approach is not only simpler, but is conservative. (Adding the pullup resistance actually decreases the
fall time from the Thevenin point of view.) Using this approach, the table shows the results of
propagating two different data values through the PLA. The time of each node's transition is shown
in nanoseconds, as predicted by RSIM and SPICE.
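The fall-time arithmetic above can be sketched in a few lines. This is only an illustration: the function names and the kΩ/pF unit convention are assumptions made here, not part of RSIM itself. The second helper shows why ignoring the pullup is conservative: putting any pullup in parallel can only lower the Thevenin resistance, and hence the predicted RC time.

```python
def rc_transition_time_ns(resistance_kohm, capacitance_pf):
    """RC time constant in ns; 1 kOhm * 1 pF = 1e3 * 1e-12 s = 1 ns."""
    return resistance_kohm * capacitance_pf


def parallel_kohm(r1_kohm, r2_kohm):
    """Parallel combination of two resistances (the Thevenin view of a node
    that has both a pulldown path and a pullup)."""
    return r1_kohm * r2_kohm / (r1_kohm + r2_kohm)


# Falling transition of node B: 16 kOhm to ground, 0.05 pF on the node.
fall_time_b = rc_transition_time_ns(16, 0.05)
print(f"node B fall time: {fall_time_b:.2f} ns")
```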
The discrepancies between the RSIM and SPICE predictions (-28% in case 1, -14% in case 2) can be
traced to the fact that the current RSIM model does not account for the shape of the input waveform
when analyzing a stage.† This is particularly noticeable in case 1 for the transition of node E. The
long rise time of node D slows the falling transition of E to a considerable extent, a fact blithely
ignored by RSIM.

The second example is a section of the OM2 data path [Mead80] consisting of the logic to drive a
register select line, a register cell, and a bus line. The path to be analyzed starts with the clock going
high, driving the select line high, finally causing the register cell to discharge the pre-charged bus line.


Figure  2.20.  Register  select  and  bus  drive  circuitry  from  OM2  data  path 


† Examining the times in this example, one might be tempted to multiply the effective resistances by a constant factor
in an effort to improve the accuracy of the predictions. But not all predictions underestimate the true transition time,
and, as will be seen in Chapter 3, there are other improvements that can be made that address the root of the problem.


Figure 2.21. Equivalent RC network for OM2 data path example

The comparative analysis is given below; RSIM comes to within 9% of the SPICE prediction.
2.5. Summary

The RSIM model can be summarized as follows:

•  Transistors are modeled as switches with series resistors. Three resistances are
chosen for each transistor and used to predict node voltages and transition times.
Resistance values are determined by experiments, either with actual circuits or
using a circuit analysis program.

•  Using the transistor model, a network of transistors and nodes is simulated as a
network of resistors (from transistors) and capacitors (from nodes). A node's
value is determined by voltages calculated in two ways: (1) from charge sharing
with electrical neighbors, and (2) from the Thevenin equivalent circuit for pieces
of network connecting the node to the inputs. When a node changes value, the
timing for the transition is given by an RC time constant calculated using the
resistances and capacitances of the surrounding network.

•  The network is viewed as an assemblage of small stages, each simple enough that
its operation can be predicted in a straightforward manner. Information
propagates through the network as a series of events (changes in a node's value);
each event leads to an analysis of affected stages using the models described
above. The isolation between stages of digital circuits allows each stage to be
analyzed separately; the relative independence of one stage from another is one
reason why the very rough approximations of RSIM are so serviceable.

Several factors important for making accurate performance predictions are missing from both the RSIM
model and the simple calibration experiments proposed in section 2.4. Chapter 3 suggests some
modifications to the model that correct the more important oversights. Many implementation details
unspecified in this chapter are discussed in Chapter 4. Chapter 4 also catalogs the successes and
failures of the RSIM model, as finally implemented.


CHAPTER THREE


Justification  of  the  Linear  Network  Model 


This chapter undertakes a performance analysis of logic gates and other digital circuits with the
goal of establishing a physical justification for the RSIM model. By comparing the resulting equations
with those proposed by RSIM, one can judge the accuracy with which the RSIM model predicts circuit
behavior. As an added benefit, insight into actual circuit operation helps to motivate model
modifications that improve the accuracy of the predictions.

The first section lays the groundwork for the analysis, presenting the first-order equations that
describe the operation of MOS transistors. The second section describes the node voltages found in
common digital logic circuits and compares the results to RSIM's predictions. The next two sections
analyze the propagation delay of logic gates and other network components. Finally, several
modifications to the RSIM model are proposed, and the resulting predictions are compared to those of
the original model.

3.1. Electrical models for mosfets and gates

The active component in a MOS circuit is the mosfet, a type of transistor. The mosfet has three
terminals: the source and drain (two symmetric connections), and the gate. By convention, the source
and drain are chosen such that $v_{ds}$, the voltage of the drain with respect to the source, is positive. $v_{gs}$,
the voltage of the gate with respect to the source, can be either positive or negative. Depending on
the relative voltages of the three terminals, the mosfet conducts varying amounts of current between
the source and drain terminals. The amount of current conducted depends on the region in which the
mosfet operates. There are three possible regions:
where $v_{th}$ is the threshold voltage of the mosfet and

is a constant that depends on the width $w$ and length $l$ of the particular mosfet under consideration.
The numeric estimate is for a typical nMOS process. These equations ignore second order effects on
$i_{ds}$.

In an nMOS process, there are two types of mosfets, distinguished by the setting of their
thresholds:


type of device      threshold (VDD = 1)

n-channel           $v_{te} \approx 0.14$

depletion           $v_{td} \approx -0.6$

As we saw in Chapter 2, the simplest form of logic gate that uses these devices consists of:

a single depletion pullup with its gate and source attached to the output node and its
drain attached to VDD, and

one or more pulldown paths connecting the output node to ground, each path
containing one or more n-channel devices.
Figure 3.1. nMOS logic gates


The depletion pullup is configured so that $v_{gs:pu} = 0$; since the threshold of a depletion device is
negative, $v_{gs:pu} - v_{td} > 0$, and the pullup is never off. Each n-channel pulldown is configured to be
on when its gate voltage exceeds $v_{te}$ and off otherwise. If all the n-channel devices in a particular
pulldown chain are conducting, the output load capacitance is discharged through the pulldown path
and the output voltage is lowered ($v_{out} = v_{ol}$ = logic low); otherwise the pullup pulls the output high
($v_{out} = v_{oh}$ = logic high).

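Equation 3.1 can be specialized for a depletion pullup, using the fact that $v_{gs:pu}$ is always zero:

$$i_{pu} = \begin{cases} \dfrac{K_{pu}}{2}\,|v_{td}|^2 & |v_{td}| \le (1 - v_{out}) \\[2ex] K_{pu}\left(|v_{td}| - \dfrac{1 - v_{out}}{2}\right)(1 - v_{out}) & |v_{td}| > (1 - v_{out}) \end{cases} \qquad (3.3)$$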


where $v_{out}$ is the voltage of the gate/source node of the pullup. Since the drain of the pullup is
connected to VDD, $v_{ds:pu} = 1 - v_{out}$. To avoid confusion, the equations will be written in terms of
$|v_{td}|$ since $v_{td}$ is negative. The current conducted by the n-channel pulldown in an inverter is given
by:


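$$i_{pd} = \begin{cases} 0 & v_{in} - v_{te} < 0 \\[1ex] \dfrac{K_{pd}}{2}\,(v_{in} - v_{te})^2 & 0 < v_{in} - v_{te} \le v_{out} \\[2ex] K_{pd}\left(v_{in} - v_{te} - \dfrac{v_{out}}{2}\right)v_{out} & v_{in} - v_{te} > v_{out} \end{cases} \qquad (3.4)$$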


where $v_{in}$ is the voltage of the gate node of the pulldown. Note that the source of the pulldown is
connected to ground ($v_{in} = v_{gs:pd}$) and the drain is connected to the inverter's output ($v_{out} = v_{ds:pd}$).
For proper operation of the inverter, the sizes of the pullup and pulldown are chosen so that $i_{pd} > i_{pu}$
when the pulldown is on.

To understand the behavior of an inverter in more detail, it is useful to plot $i_{ds}$ of the
component devices as a function of the inverter's output voltage:


Figure  3.2.  mosfet  I-V  characteristics 


The $i_{ds}$ of a depletion pullup depends only on $v_{out}$ and thus a single curve suffices to show their
relationship. For the n-channel pulldown, there is a family of curves for $i_{ds}$ corresponding to different
values of $v_{in}$.

The intersection of the $i_{ds}$ curves for the pullup and pulldown shows the inverter's output
voltage, given a particular input voltage:


In fact, one can plot the DC voltage transfer curve for an inverter, which shows the inverter's output
voltage as a function of its input voltage.
Figure 3.4. Voltage transfer curve for an inverter


The four regions (I to IV) of the curve correspond to various combinations of the pullup's and
pulldown's operating regions. Note that the relationship between $v_{in}$ and $v_{out}$ shown in figures 3.3
and 3.4 applies when the voltages are allowed to stabilize; in a circuit with changing voltages, the
relationship between $v_{in}$ and $v_{out}$ is considerably more complicated, as will be seen in section 3.4.

The next few sections use the equations presented here to develop equations for the quantities
predicted by RSIM (node voltages and transition times) so that the RSIM model can be evaluated
and perhaps improved.


3.2.  Node  voltages 

When $v_{in} < v_{te}$, the n-channel pulldown conducts no current; the depletion load continues to
conduct as long as $v_{out} < 1$. Therefore, the logic high output voltage of an inverter is given by the
equation:

$$v_{oh} = 1 \qquad (3.5)$$

When $v_{in} > v_{te}$, the n-channel pulldown is on and the output node reaches an equilibrium voltage $v_{ol}$,
which is determined by (1) the relative sizes of the pullup and pulldown and (2) the gate voltage on
the pulldown. $v_{ol}$ is that voltage where the current of the pulldown (at this point in its linear region)
is balanced by the current of the pullup (in saturation):


$$K_{pd}\left(v_{in} - v_{te} - \frac{v_{ol}}{2}\right)v_{ol} = \frac{K_{pu}}{2}\,|v_{td}|^2 \qquad (3.6)$$
If one assumes that $v_{in} = 1$ (as is the case when $v_{oh}$ of the previous stage is 1) and that
$v_{ol}/2 \ll 1 - v_{te}$, then

$$v_{ol} = \frac{|v_{td}|^2}{2R\,(1 - v_{te})} \approx \frac{0.21}{R} \qquad (3.7)$$


where $R = \dfrac{K_{pd}}{K_{pu}} = \dfrac{l_{pu}/w_{pu}}{l_{pd}/w_{pd}}$ is the ratio of the sizes of the pullup and pulldown. $R$ is chosen so as
to guarantee that the low output of a gate turns off the pulldowns of gates connected to the output,
i.e., so that $v_{ol}$ is less than $v_{te}$ by a comfortable margin; typically $R$ is chosen to be about 4 if $v_{in} = 1$.
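As a quick numeric check of equation 3.7, the sketch below evaluates the logic-low output voltage for a given pullup/pulldown ratio, using the normalized thresholds quoted earlier (0.14 and -0.6 with VDD = 1). The function and constant names are illustrative assumptions, not anything defined by RSIM.

```python
V_TE = 0.14   # n-channel (enhancement) threshold, normalized to VDD = 1
V_TD = -0.6   # depletion threshold, normalized to VDD = 1


def logic_low_voltage(ratio, v_te=V_TE, v_td=V_TD):
    """Equation 3.7: v_ol ~= |v_td|^2 / (2 R (1 - v_te)), assuming v_in = 1
    and v_ol / 2 much smaller than 1 - v_te."""
    return abs(v_td) ** 2 / (2.0 * ratio * (1.0 - v_te))


v_ol = logic_low_voltage(4)   # the typical ratio R = 4
margin = V_TE - v_ol          # how far v_ol sits below the threshold
print(f"v_ol = {v_ol:.3f}, margin below v_te = {margin:.3f}")
```

With R = 4 the low output sits well under the n-channel threshold, which is the "comfortable margin" the text asks for.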

Now consider the RSIM model for an inverter:


(a) $v_{in}$ at logic low    (b) $v_{in}$ at logic high

Figure 3.5. RSIM inverter model


When $v_{in}$ is low, the pulldown is off and the inverter is modeled with a single resistor. In this
configuration, RSIM predicts

$$v_{oh:RSIM} = 1 \qquad (3.8)$$


agreeing with equation 3.5, independent of the value chosen for $R_{pu}$. When $v_{in}$ is high, the inverter is
modeled by a voltage divider. RSIM predicts

$$v_{ol:RSIM} = \frac{R_{pd}}{R_{pd} + R_{pu}} \qquad (3.9)$$


One should choose $R_{pu}$ and $R_{pd}$ so that $v_{ol:RSIM}$ is the same as $v_{ol}$, as given by equation 3.7. Thus
the RSIM model can accurately predict the output voltages of logic gates; in fact, there are two
unknowns and only one equation to satisfy, so there is some freedom in choosing the static resistance
values.
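The "two unknowns, one equation" remark can be made concrete with a small sketch: fix one resistance arbitrarily and solve the divider of equation 3.9 for the other. The helper names and the 40 kΩ pullup value are hypothetical, chosen only for illustration.

```python
def rsim_logic_low(r_pd, r_pu):
    """Equation 3.9: voltage divider formed by pulldown and pullup (VDD = 1)."""
    return r_pd / (r_pd + r_pu)


def pulldown_for_target(v_ol_target, r_pu):
    """Solve equation 3.9 for R_pd, given a target v_ol and a chosen R_pu."""
    return r_pu * v_ol_target / (1.0 - v_ol_target)


target = 0.21 / 4           # v_ol from equation 3.7 with R = 4
r_pu = 40.0                 # kOhm, a hypothetical choice
r_pd = pulldown_for_target(target, r_pu)

# Scaling both resistances by the same factor leaves the divider unchanged,
# which is the remaining degree of freedom.
same = rsim_logic_low(2 * r_pd, 2 * r_pu)
print(f"R_pd = {r_pd:.2f} kOhm, v_ol = {rsim_logic_low(r_pd, r_pu):.4f}")
```

Only the ratio of the two static resistances is pinned down by the voltage requirement; their absolute scale remains free.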
There are circuits for which RSIM does not properly predict node voltages. For example, in the
following circuit, the voltage of node B only reaches $1 - v_{te}$:


(a)  sample  circuit  (b)  equivalent  resistor  networks 

Figure  3.6.  Sample  circuit  illustrating  voltage  drop  across  pass  transistor 


N-channel devices configured the same way as the horizontal transistor in figure 3.6(a) are called
"pass" transistors, and are used to implement dynamic latches, various types of steering logic, and so
on. Figure 3.6(b) shows the equivalent resistor networks for the circuit. According to this model, the
voltage for node B should reach VDD when node A is low. In the actual circuit, however, the pass
transistor cuts off when B reaches $1 - v_{te}$ since, at that point, $v_{gs:pass} = v_{te}$. In general, the source
voltage of a pass transistor never rises above a threshold-drop below its gate voltage. Thus the RSIM
model incorrectly predicts the voltage of node B.
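The threshold-drop rule stated above can be written as a one-line clamp. This is a sketch under the same normalization (VDD = 1, threshold 0.14); the function name is an assumption, and the rule is only meaningful when the gate voltage exceeds the threshold so that the device conducts at all.

```python
def max_passed_voltage(v_drive, v_gate, v_te=0.14):
    """Highest voltage an n-channel pass transistor can deliver to its source:
    the driving voltage, clamped one threshold below the gate voltage.
    Assumes v_gate > v_te (otherwise the device never turns on)."""
    return min(v_drive, v_gate - v_te)


# Gate at VDD, driven toward VDD: the source stops at 1 - v_te.
print(max_passed_voltage(1.0, 1.0))
# A low level passes through unchanged.
print(max_passed_voltage(0.0, 1.0))
```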

In fact, the network analysis performed by RSIM does recognize that node B never reaches VDD.
As shown by several examples in Chapter 2, the resistance for a pulldown with a gate that has a
threshold voltage drop is not chosen in the same way as the resistance for a normal pulldown. In
other words, the value of R5 in figure 3.6(b) reflects the knowledge that node B has a threshold drop.
This knowledge could also be used to adjust the prediction of B's voltage, but this is not currently part
of the calculation.

There are many other circuit configurations that are beyond the ability of RSIM to analyze,
although most such circuits could not, in all fairness, be called digital. One important exception, which
RSIM does not handle, but which occurs in performance-critical digital circuits, is called bootstrapping.


Figure 3.7. Bootstrap circuits lead to voltages greater than VDD

Node A is small compared to node B, to which it is capacitively coupled. The coupling capacitor need
not be explicit: often enough coupling is provided by the gate/source overlap capacitance of the
transistor controlled by A. Node A is driven high through a pass transistor, and in turn enables the
n-channel pullup that is controlled by A and connected to node B. Since the capacitance of A is small
compared to that of B, A reaches a significant voltage before the voltage of node B begins to change;
the difference is usually around 3 volts in common bootstrap configurations. As the voltage of node B
increases, the coupling capacitor maintains this initial voltage difference between nodes A and B, and
so the voltage of A increases correspondingly.† It is not unusual for node A to reach 8 volts or more.
This, of course, increases the voltage on the gate of the pullup, which in turn increases the current
flowing into node B. The net result is that node B reaches its final value much more quickly than one
might expect. Just as important, the voltage of B rises all the way to VDD instead of stopping two
threshold drops below, as a simple analysis might predict.

Both the faster transition time and higher-than-expected voltage for node B are completely
missed by RSIM. Since such circuits are often used in time-critical portions of the network, it would be
nice for RSIM to make correct predictions in this case. Unfortunately, there is no simple change to the
simple RSIM model that achieves the desired result. However, by systematically replacing bootstrap
circuits with more conventional circuits sized to give the same performance, RSIM can produce the
correct results. This technique is discussed in the section on escape mechanisms in Chapter 4.

In  summary,  RSIM 

†The pass device through which node A is driven isolates A from the driving circuitry. After the voltage of node A
reaches $1 - v_{te}$, the pass device cuts off, and stays off no matter how large the voltage on node A becomes. This is
because $v_{g:pass} - v_{te}$ will be less than the voltage on either the source or the drain.
(i) predicts the output voltage of logic gates with acceptable accuracy.

(ii) does not predict threshold drops introduced by pass transistors, but does perform
a static analysis of the network to recognize transistors whose gates are subject to
a threshold drop, and adjust the modeling resistance accordingly.

(iii) does not handle bootstrap and other more exotic circuits. However, a pattern
matching/replacement technique is available for substituting equivalent circuits
that simulate correctly.


3.3.  Propagation  delay:  overview 

When  choosing  a  single  number  to  characterize  the  timing  behavior  of  a  circuit,  one  often  settles 
for  determining  the  propagation  delay:  a  measure  of  the  length  of  time  required  for  a  change  in  an 
input  value  to  be  reflected  in  the  output  value.  In  digital  circuitry,  a  significant  change  is  one  where 
the  signal  changes  from  logic  low  to  logic  high  or  vice  versa.  For  a  particular  transition  it  is  common 
to  define  "change"  in  relation  to  a  threshold:  the  signal  is  said  to  change  when  it  crosses  the  threshold. 
Consider  the  following  single  input,  single  output  circuit: 


Figure  3.8.  Test  setup  for  measuring  propagation  delay 


The propagation delay is defined as

$$t_p = t_{output} - t_{input} \qquad (3.10)$$

where

$t_{output}$ is the time when the output voltage crosses the output threshold voltage;

$t_{input}$ is the time when the input voltage crosses the input threshold voltage.

This definition works well for a transition between 0 and 1; however, delays associated with a
transition to the X state are still not well defined since it is unclear whether the signals in question
cross the threshold or not. Aside from this technical difficulty, the notion of propagation delay
involving X's is rather muddy since X is not a "real" logic value, but more of an error state. The
simulation algorithm must assign some delay to such a transition, and RSIM conservatively chooses the
fastest possible transition of which the node is capable (see equations 2.8 and 2.9).
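The threshold-crossing definition of propagation delay can be illustrated on sampled waveforms. The sketch below is an assumption-laden illustration (the function names, the list-based waveform format, and the linear interpolation between samples are all choices made here); it returns `None` when a waveform never crosses the threshold, which is the ill-defined situation the text describes for transitions to X.

```python
def crossing_time(times, volts, threshold):
    """First time a sampled waveform crosses `threshold`, using linear
    interpolation between samples; None if it never crosses."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 != v1 and (v0 - threshold) * (v1 - threshold) <= 0:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    return None


def propagation_delay(times, v_in, v_out, threshold):
    """t_p = t_output - t_input, using one threshold for input and output."""
    t_input = crossing_time(times, v_in, threshold)
    t_output = crossing_time(times, v_out, threshold)
    if t_input is None or t_output is None:
        return None
    return t_output - t_input


times = [0, 1, 2, 3, 4]
v_in = [0.0, 0.5, 1.0, 1.0, 1.0]    # rising input
v_out = [1.0, 1.0, 1.0, 0.5, 0.0]   # falling output of an inverting stage
delay = propagation_delay(times, v_in, v_out, 0.44)
print(delay)
```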


The next step is to choose the input and output thresholds, a choice that depends on the
particular circuit to be analyzed. There are two important criteria for choosing thresholds:

(1) The delay should never be negative. The thresholds should be chosen so that the
input always crosses its threshold before the output does. The simulation
algorithm quite naturally processes events in the scheduled order; allowing a
negative delay might require backing up a previously processed event.

(2) The output threshold for a circuit should be chosen without regard to its use,
allowing a single threshold to be chosen for all inputs and outputs. In that case,
only one delay computation is needed for each signal transition.

Though these criteria are not compatible in general, they can both be met for the digital circuits of
interest here.

To simplify the analysis below, we will restrict the class of input waveforms considered. In his work
on waveform bounding, Wyatt [Wyatt83] observes that the transfer functions characterizing digital MOS
circuitry meet certain criteria which guarantee that

if  two  monotonic  trial  waveforms  are  chosen  that  bound  the  actual  input  waveform 
(which  also  must  be  monotonic),  then  the  response  of  the  circuit  to  the  trial  waveforms 
will  bound  the  actual  output  waveform. 

Thus one can choose computationally convenient input waveforms, e.g., simple voltage ramps, and
determine the bounds on the propagation delay by analyzing ramps that bound the true input
waveform.


3.4.  Propagation  delay:  logic  gates 

In order to explore the timing behavior of MOS logic gates, this section analyzes the behavior of
an nMOS inverter with a simple voltage ramp on its input. The analysis is based on the first-order
equations for the component devices, presented in the previous section. The derivation is easily
extended to more complex gates by adjusting the parameters of the inverter's pulldown to model the
net pulldown-path resistance of the currently active pulldowns in the complex gate (see section 3.4.4).
The derivation also applies to CMOS logic gates; the analysis of the low-to-high transition caused by a
p-channel pullup is very similar to the high-to-low transition caused by an n-channel pulldown. For
simplicity, only nMOS gates are considered below.

For  the  purposes  of  the  analysis,  the  inverter  output  is  connected  to  a  fixed  capacitance  that 
models  the  load  driven  by  the  inverter. 
Figure 3.9. Inverter circuit to be analyzed


At each moment, the output voltage and the current charging/discharging the load capacitance are
related by

$$i_{load} = C_{load}\,\frac{dv_{out}}{dt} \qquad (3.11)$$


Unfortunately, this differential equation is hard to use as it stands because $i_{load}$ is a function of both
$v_{out}$ and $t$. However, if one can find a suitable approximation for $i_{load}$ that removes the dependency
on $v_{out}$, then the change in output voltage over a given time period can be determined by integrating:

$$C_{load}\,\Delta v_{out} = \int_0^t i_{load}(t)\,dt \qquad (3.12)$$

The time needed for $v_{out}$ to change a specified amount is calculated by first performing the integration
and then solving the resulting equation for $t$. This suggests the following plan of attack:

(i) Find suitable approximations for $i_{load}$ to remove the dependencies on $v_{out}$.

(ii) Compute the output transition time using equation 3.12.

(iii) Subtract from (ii) the input transition time, giving the actual delay from input to
output. Rearrange the answer into an RC term (what RSIM predicts) and an
error term.
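Step (ii) of the plan can also be carried out numerically, which is a handy cross-check on any closed-form answer. The sketch below is an illustration only: it assumes the load current is supplied as a function of time alone (as step (i) requires), accumulates charge with the trapezoid rule, and reports when the accumulated charge reaches the load capacitance times the required voltage change. The function name, step size, and time limit are arbitrary choices made here.

```python
def time_to_change(i_load, c_load, delta_v, dt=1e-3, t_max=1e3):
    """Smallest t such that the integral of i_load from 0 to t reaches
    c_load * delta_v (equation 3.12), or None if not reached by t_max.
    Units only need to be mutually consistent."""
    target = c_load * delta_v
    charge, t = 0.0, 0.0
    while t < t_max:
        step = 0.5 * (i_load(t) + i_load(t + dt)) * dt   # trapezoid rule
        if step > 0 and charge + step >= target:
            # Interpolate inside the final step.
            return t + dt * (target - charge) / step
        charge += step
        t += dt
    return None


# Constant current: t = C * delta_v / i, here 2.0 * 0.5 / 1.0.
t_const = time_to_change(lambda t: 1.0, 2.0, 0.5)
print(t_const)
```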

This  discussion  starts  with  a  small  digression  on  choosing  the  appropriate  threshold  voltage. 


3.4.1.  Choosing  the  input/output  threshold 

To see if one can choose a single logic threshold and still guarantee that the predicted delay is
never negative, it is useful to consult the voltage transfer curve for an inverter:
region    pulldown    pullup

I         off         linear

II        sat         linear

III       sat         sat

IV        linear      sat


Figure 3.10. Voltage transfer curve for inverter


The  transfer  curve  shows  the  static  behavior  of  the  inverter:  for  any  given  input  voltage,  it  tells  what 
the  output  voltage  must  be  for  the  pullup  and  pulldown  currents  to  balance.  If  the  input  changes 
rapidly  enough,  the  output  voltage  may  lag  behind.  If  the  input  is  going  from  low  to  high,  then  the 
transfer  curve  shows  the  minimum  output  voltage  for  a  given  input  voltage;  for  a  high-to-low  input 
transition,  the  transfer  curve  shows  the  maximum  output  voltage  for  a  given  input  voltage. 

Since it is desirable for the input and output thresholds to be the same, the input/output
threshold voltage $v_{thresh}$ is chosen to be the point on the transfer curve where $v_{in} = v_{out}$.† This means
that during a low-to-high input transition, if $v_{in} < v_{thresh}$, then $v_{out} > v_{thresh}$, no matter how fast or
slow the transition. In other words, the propagation delay is never negative. A similar argument
applies for the other transition. To estimate $v_{thresh}$, first notice that at the region II-region III
boundary,

$$v_{in} = v_{te} + \frac{|v_{td}|}{\sqrt{R}} \quad\text{and}\quad v_{out} = 1 - |v_{td}| \qquad (3.13)$$

If $R = 4$, then $v_{in} = .44$ and $v_{out} = .4$, and so $v_{thresh}$ is in region II (just barely). In this region the
pulldown is in saturation and the pullup is in the linear region:

$$\frac{K_{pd}}{2}\,(v_{in} - v_{te})^2 = K_{pu}\left(|v_{td}|\,(1 - v_{out}) - \frac{(1 - v_{out})^2}{2}\right) \qquad (3.14)$$

†The same choice of threshold has been made in several other simulators [Koppel78, Nahm80].
Setting $v_{in} = v_{out} = v_{thresh}$ and solving for $v_{thresh}$ yields $v_{thresh} = .439$, close enough to the II-III
boundary that the distinction is not important.
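The quoted threshold can be checked by solving the region II balance condition numerically. The sketch below assumes the normalized thresholds used throughout (0.14 and -0.6), a size ratio of 4 between pulldown and pullup gain constants, and a saturated pulldown balanced against a linear-region pullup with input and output tied together; it divides both sides by the pullup constant and finds the root by bisection. All names are illustrative.

```python
V_TE = 0.14    # n-channel threshold (VDD = 1)
V_TD = -0.6    # depletion threshold (VDD = 1)
R = 4.0        # K_pd / K_pu, assumed ratio


def current_imbalance(v):
    """Pulldown current minus pullup current, both divided by K_pu, with
    v_in = v_out = v (pulldown saturated, pullup in its linear region)."""
    pulldown = 0.5 * R * (v - V_TE) ** 2
    pullup = abs(V_TD) * (1.0 - v) - 0.5 * (1.0 - v) ** 2
    return pulldown - pullup


def solve_threshold(lo=0.4, hi=0.5, tol=1e-9):
    """Bisection on [lo, hi]; the imbalance changes sign on this interval."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if current_imbalance(lo) * current_imbalance(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


v_thresh = solve_threshold()
print(f"v_thresh = {v_thresh:.3f}")
```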

3.4.2. Low-to-high output transition time, $t_{plh}$

To calculate $t_{plh}$, an approximation for $i_{load}(t)$ is needed. $i_{load}$ is just the difference between the
pullup current ($i_{pu}$) and the pulldown current ($i_{pd}$), so one strategy is to approximate the current
through each component individually. Recall that $v_{thresh}$ is near the region II-region III boundary of
the inverter's voltage transfer curve, and notice that the part of the transition involved in the
prediction ($v_{out}$ rising from 0 to $v_{thresh}$) takes place almost entirely with the inverter operating in
regions III and IV. This means that the pullup is in saturation, i.e.,

$$i_{pu} = \frac{K_{pu}}{2}\,|v_{td}|^2 = i_{max} \qquad (3.15)$$

Choosing a specific approximation for $i_{pd}$ is not as straightforward. However, a good starting point is
an approximation of the form shown in the following figure.


(a) approximation for $i_{pd}$    (b) resulting approximation for $i_{load}$

Figure 3.11. Approximation of $i_{pd}$ for $t_{plh}$ calculation


$t_{off}$ is the time at which $v_{in} = v_{te}$. At this point in the development, there is not much one can say
about $t_a$, the time at which the pulldown current first starts to decrease. Certainly $t_a = t_{off}$ is an
upper bound (resulting in a step function for $i_{pd}$). Similarly, $t_a = 0$ is a lower bound since that is the
time when the input voltage first changes. The choice of a specific value for $t_a$ will be discussed later.

With this approximation, the output transition time, $t_h$, is given by

$$C_{load}\,v_{thresh} = \int_0^{t_h} i_{load}(t)\,dt \qquad (3.16)$$


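where

$$i_{load}(t) = \begin{cases} 0 & t < t_a \\[1ex] i_{max}\,\dfrac{t - t_a}{t_{off} - t_a} & t_a \le t \le t_{off} \\[2ex] i_{max} & t_{off} \le t \end{cases}$$

Solving equation 3.16 for $t_h$ yields

$$t_h = \begin{cases} R_{pu}C_{load} + \tfrac{1}{2}(t_{off} + t_a) & t_h \ge t_{off} \\[1ex] \left[2R_{pu}C_{load}(t_{off} - t_a)\right]^{1/2} + t_a & t_h \le t_{off} \end{cases} \qquad (3.17)$$

where

$$R_{pu} = \frac{v_{thresh}}{i_{max}} \qquad (3.18)$$

Recalling that $t_{plh} = t_h - t_{input}$,

$$t_{plh} = \begin{cases} R_{pu}C_{load} + \tfrac{1}{2}(t_{off} + t_a) - t_{input} & t_{plh} \ge t_{off} - t_{input} \\[1ex] \left[2R_{pu}C_{load}(t_{off} - t_a)\right]^{1/2} + t_a - t_{input} & t_{plh} \le t_{off} - t_{input} \end{cases} \qquad (3.19)$$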


The following figure plots $t_{plh}$ as a function of $t_{input}$. Note that there is a relationship among the
values of $t_{input}$, $t_{off}$, and $t_a$. For this plot, a linear relationship is assumed for the values. Their exact
relationship is determined by the shape of the input waveform, a topic pursued below.


Figure 3.12. $t_{plh}$ as a function of $t_{input}$


Several interesting observations can be made. When the input is a voltage step, $t_{off}$, $t_a$, and $t_{input}$ are
all zero, so $t_{plh:step} = R_{pu}C_{load}$, i.e., a simple RC time constant, precisely the prediction made by
the RSIM model.
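The two-branch delay expression derived from the piecewise-linear load current can be evaluated directly. The sketch below is an illustration with assumed names and arbitrary consistent units: it uses the linear branch when the output crosses the threshold after the pulldown has turned off, and the square-root branch otherwise. The branches are complementary: the linear one applies exactly when the RC product is at least half the interval between the start of the current ramp and turn-off.

```python
import math


def low_to_high_delay(r_pu, c_load, t_input, t_off, t_a):
    """Low-to-high propagation delay for a load current that is zero until
    t_a, ramps linearly to i_max at t_off, and stays at i_max afterwards.
    r_pu is v_thresh / i_max; all times are measured from the start of the
    input transition."""
    rc = r_pu * c_load
    t_h = rc + 0.5 * (t_off + t_a)           # output crosses after t_off
    if t_h < t_off:                           # crossing occurs during the ramp
        t_h = math.sqrt(2.0 * rc * (t_off - t_a)) + t_a
    return t_h - t_input


# Step input: all three input-related times are zero, leaving the RC term.
step_delay = low_to_high_delay(10.0, 0.1, 0.0, 0.0, 0.0)

# Slow input: the square-root branch applies and stays below the linear form.
slow_delay = low_to_high_delay(1.0, 1.0, t_input=3.0, t_off=10.0, t_a=2.0)
linear_bound = 1.0 * 1.0 + 0.5 * (10.0 + 2.0) - 3.0
print(step_delay, slow_delay, linear_bound)
```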
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lo  sec  what  happens  when  ihe  input  is  not  a  step,  notice  that 
Ip!  It  ^  Rpu^  loiiJ  I  ^  ( I  l'JJ  t-  la)  .  Iinput  (3.20) 


since 


l 

[2 Rpu(  l(iaj(loff  —  I a)\~  la 


(3.21) 


when  ipih  >  t0jj  -  i wpui ■  (This  can  be  verified  by  comparing  the  derivatives  of  the  two  sides  of  the 
inequality  or  by  simply  extending  the  linear  portion  of  the  ip/f,  curves  —  those  portions  above  the 
dotted  line  —  in  the  plot  above.)  F.quation  3.20  looks  like  the  response  for  a  step  input  delayed  by  an 
amount  that  depends  solely  on  parameters  of  the  input  waveform. 
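The piecewise delay of equation 3.19 and the bound of equation 3.20 can be checked numerically. The following is a minimal sketch, assuming normalized units and treating the product $R_{pu} C_{load}$ as a single parameter `rc`; the function names are illustrative and are not part of RSIM.

```python
import math

def t_plh(rc, t_a, t_off, t_input):
    """Delay from equation 3.19 for the piecewise-linear load current."""
    # Top case: the threshold is reached after the current has saturated.
    t_h = rc + 0.5 * (t_off + t_a)
    if t_h < t_off:
        # Bottom case: the threshold is reached while the current is still ramping.
        t_h = math.sqrt(2.0 * rc * (t_off - t_a)) + t_a
    return t_h - t_input

def t_plh_bound(rc, t_a, t_off, t_input):
    """Right-hand side of equation 3.20 (step response plus a waveform offset)."""
    return rc + 0.5 * (t_off + t_a) - t_input

if __name__ == "__main__":
    for t_off in (0.0, 2.0, 9.0):
        print(t_off, t_plh(1.0, 0.0, t_off, 0.0), t_plh_bound(1.0, 0.0, t_off, 0.0))
```

For a step input (all three times zero) both functions reduce to `rc`, matching the observation above.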

Figure 3.12 provides some insight into the choice of an appropriate value for $t_a$. From the plot, one can see that $t_{plh}$ eventually goes to zero for some choices of $t_a$, but increases indefinitely for other choices. By determining whether $t_{plh}$ goes to zero in an actual circuit, it is possible to narrow the range of choices for $t_a$. If the input changes slowly enough, one expects the output voltage to follow the voltage transfer curve very closely. (This is essentially the definition of the voltage transfer curve.) Thus, when $v_{in} = v_{thresh}$, it follows that $v_{out} = v_{thresh}$ since $v_{thresh}$ is the balance point of the inverter. This implies $t_{plh} = 0$ for sufficiently slow input transitions.

Examining the bottom term of equation 3.19, one can see that $t_{plh}$ is zero for slow input transitions only if $t_a < t_{input}$.† In other words, if $t_a \ge t_{input}$, the predicted propagation delay can never be zero; the prediction will be longer than the true propagation delay. Thus, it is possible to rewrite equation 3.20 using $t_a = t_{input}$ and still preserve the inequality.

$$t_{plh} \le R_{pu} C_{load} + \tfrac{1}{2}(t_{off} - t_{input}) \tag{3.22}$$

This equation can be simplified still further with some assumptions about the input waveform.

† The bottom term has the form $[f(t)]^{1/2} + g(t)$, which reaches zero for large $t$ only if $g(t)$ is negative.

Figure 3.13. Assumed input waveform for low-to-high output transition

If the input is a falling voltage ramp which starts at $t = 0$ and reaches zero at $t = \delta$, then $t_{input} = (1 - v_{thresh})\delta$ and $t_{off} = (1 - v_{te})\delta$. Substitution into equation 3.22 yields

$$t_{plh} \le R_{pu} C_{load} + \tfrac{1}{2}(v_{thresh} - v_{te})\delta = R_{pu} C_{load} + (0.15)\delta \tag{3.23}$$

where the numerical estimate is computed for a typical 5μ nMOS process. Thus RSIM potentially underestimates $t_{plh}$ for a logic gate with a slow input transition (a large $\delta$). As $\delta$ decreases (a faster input transition), the accuracy of RSIM's predictions increases. Note that $R_{pu}$ is exactly the resistance measured by the experiment proposed in figure 2.16(a).

3.4.3. High-to-low output transition time, $t_{phl}$

In the previous section, the equation for $t_{plh}$ was developed by overestimating the current through the pulldown, leading to an upper bound for the low-to-high propagation delay. The same technique can be used to estimate $t_{phl}$, the high-to-low transition time. In this case, however, one wants to underestimate the pulldown current (and overestimate the pullup current) to find an upper bound for $t_{phl}$.†

For the portion of the high-to-low output transition which is of interest ($v_{out}$ falling from 1 to $v_{thresh}$), the pullup is in its linear region. As before, $i_{pu}$ can be approximated by the pullup's saturation current; an overestimate, but one consistent with the goals of this section. Also as before, estimating the pulldown current is difficult. Consider the following diagram of various load lines for the pulldown. The trajectory of a load line shows $i_{pd}$ as a function of time:

Figure 3.14. Load lines for the pulldown for various input transitions

† Most MOS circuits use multiple phase clocking with simple logic circuits between latches controlled by different phase clocks. This means that circuit performance is determined by the maximum propagation delay through the simple logic; this is the only quantity estimated by RSIM. Other technologies (TTL, ECL) support single-clock, synchronous designs in which minimum propagation delays can be very important for correct circuit operation. This is rare in MOS circuits and such designs are not supported by RSIM.

When the input transition is fast in comparison to the output transition, the pulldown turns on to its maximum current capacity (the upper load line in figure 3.14). As $v_{out}$ drops, the current in the pulldown also decreases, and the trajectory follows the maximum current curve until it reaches $v_{thresh}$. When the input transition is slow, the output voltage falls fast enough to keep the pulldown and pullup currents balanced (the bottom load line in figure 3.14), so the trajectory for $i_{pd}$ follows the $i_{pu}$ curve.

In the proposed approximation, $i_{pd}$ rises linearly to a maximum current equal to the actual current through the pulldown when $v_{in} = 1$ and $v_{out} = v_{thresh}$. This certainly underestimates the actual pulldown current for a fast transition, and is roughly equal to the pulldown current for a slow transition, except for the last part of the transition. Fortunately, in this portion of the transition (near the threshold), a small change in the input voltage causes a large change in the output voltage, so only a small amount of time is actually spent in the overestimated part of the transition. This approximation leads to the following estimate for $i_{load}$:

Figure 3.15. Estimate of $i_{load}$ for $t_{phl}$ calculation


where $t_1$ is the time at which $v_{in} = 1$ and $i_{max}$ is the maximum pulldown current minus the pullup current:

$$i_{max} = \kappa_{pd}\left(1 - v_{te} - \frac{v_{thresh}}{2}\right) v_{thresh} - \frac{\kappa_{pu}}{2}\, v_{td}^{2} \tag{3.24}$$

As before, $t_a$ will be chosen to ensure that the estimate is an upper bound to the actual propagation delay.

The derivation of a formula for $t_{phl}$ and the choice of $t_a$ is very similar to that of the previous section, so only the conclusion is presented here:

$$t_{phl} \le R_{pd} C_{load} + \tfrac{1}{2}(t_1 - t_{input}) \tag{3.25}$$

where $R_{pd} = \dfrac{1 - v_{thresh}}{i_{max}}$. If the input is a rising voltage ramp that starts at $t = 0$ and reaches 1 at $t = \delta$, then

$$t_{phl} \le R_{pd} C_{load} + \tfrac{1}{2}(1 - v_{thresh})\delta = R_{pd} C_{load} + (0.28)\delta \tag{3.26}$$

As before, RSIM potentially underestimates $t_{phl}$ for a logic gate with a slow input transition (a large $\delta$). As $\delta$ decreases (a faster input transition), the accuracy of RSIM's predictions increases. Note that the experiment proposed in figure 2.16(d) does not measure $R_{pd}$. Instead, the experiment measures the average resistance associated with the fast input transition shown in figure 3.14, omitting the contribution of the pullup. This resistance is less than $R_{pd}$, although it is not clear by how much. The net result is a tendency to underestimate $t_{phl}$ by the original RSIM model, calibrated as in Appendix 2.

3.4.4.  Why  analyzing  inverters  is  sufficient 

The results of sections 3.4.2 and 3.4.3 were developed for the nMOS inverter. This section extends the results to NAND and NOR gates as well. Equations are developed for the amount of current flowing through the NOR and NAND pulldown configurations and then the results are compared with the equations for a simple inverter.

(a) NOR pulldown configuration    (b) NAND pulldown configuration

Figure 3.16. Currents through NOR and NAND transistor configurations


The propagation delay of a NOR gate with a single active pulldown is exactly that of an inverter. If both pulldowns are active simultaneously, $i_{nor} = i_1 + i_2$, since the current through each pulldown can be computed independently. Thus, when both pulldowns are on, and their gates are at the same voltage (i.e., logic high), the total current through the pulldowns is

$$i_{nor} = \begin{cases} (\kappa_1 + \kappa_2)\left(v_{in} - v_{te} - \dfrac{v_{out}}{2}\right) v_{out} & \text{(linear)} \\ \dfrac{\kappa_1 + \kappa_2}{2}\,(v_{in} - v_{te})^2 & \text{(saturated)} \end{cases} \tag{3.27}$$

which is equivalent to the current through a single pulldown sized so that

$$\kappa_{single\ pulldown} = \kappa_1 + \kappa_2 \tag{3.28}$$

As one might expect, this is the formula for combining two conductances in parallel.

The analysis of a NAND gate is more complicated because the currents through the two pulldowns are not independent. The currents through the pulldowns are given by

$$i_1 = \begin{cases} \kappa_1\left(v_{in} - v_m - v_{te} - \dfrac{v_{out} - v_m}{2}\right)(v_{out} - v_m) & \text{(linear)} \\ \dfrac{\kappa_1}{2}\,(v_{in} - v_m - v_{te})^2 & \text{(saturated)} \end{cases} \tag{3.29}$$

$$i_2 = \kappa_2\left(v_{in} - v_{te} - \dfrac{v_m}{2}\right) v_m \quad \text{(linear)} \tag{3.30}$$

where $v_m$ is the voltage of the node that is common to the two pulldowns. Two equations are needed for the top pulldown, because the pulldown may be in either its saturated or linear region, depending on the relative values of $v_{in}$ and $v_{out}$. Only one equation is needed for the bottom pulldown, because it is assumed that $v_m$ is never large enough for the bottom pulldown to become saturated. In the steady state $i_1$ must equal $i_2$. This gives a set of equations to solve for $v_m$; substituting the solution into equation 3.29 yields the net current through the pulldown. The result is

$$i_{nand} = \begin{cases} \dfrac{\kappa_1 \kappa_2}{\kappa_1 + \kappa_2}\left(v_{in} - v_{te} - \dfrac{v_{out}}{2}\right) v_{out} & \text{(linear)} \\ \dfrac{\kappa_1 \kappa_2}{2(\kappa_1 + \kappa_2)}\,(v_{in} - v_{te})^2 & \text{(saturated)} \end{cases} \tag{3.31}$$

This is the same amount of current as that for a single pulldown sized such that

$$\kappa_{single\ pulldown} = \frac{\kappa_1 \kappa_2}{\kappa_1 + \kappa_2} \tag{3.32}$$

Again, as one might expect, this is the formula for combining two conductances in series.

The conclusion to be drawn from equations 3.28 and 3.32 is that the current flowing through a parallel or a series configuration of pulldowns can be modeled as the current flowing through a single pulldown of the appropriate size. This means that the formulas for the propagation delay through an inverter are directly applicable to more complex logic gates.
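The series and parallel equivalences can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: both pulldowns are taken to be in their linear region, the voltages are normalized, and the function names and sample values ($\kappa_1 = 2$, $\kappa_2 = 3$, $v_{te} = 0.14$, $v_{out} = 0.44$) are chosen for the example rather than taken from RSIM. The middle-node voltage $v_m$ is found by bisection on the condition $i_1 = i_2$.

```python
def linear_current(kappa, v_gate, v_src, v_drain, v_te):
    """Linear-region current: kappa * (v_gs - v_te - v_ds / 2) * v_ds."""
    v_ds = v_drain - v_src
    return kappa * (v_gate - v_src - v_te - v_ds / 2.0) * v_ds

def nand_current(k1, k2, v_in, v_out, v_te, steps=60):
    """Series pulldowns: bisect on v_m until top and bottom currents agree."""
    lo, hi = 0.0, v_out
    for _ in range(steps):
        v_m = 0.5 * (lo + hi)
        i_top = linear_current(k1, v_in, v_m, v_out, v_te)
        i_bottom = linear_current(k2, v_in, 0.0, v_m, v_te)
        if i_bottom < i_top:
            lo = v_m   # bottom device carries too little current: raise v_m
        else:
            hi = v_m
    return linear_current(k2, v_in, 0.0, 0.5 * (lo + hi), v_te)

def nor_current(k1, k2, v_in, v_out, v_te):
    """Parallel pulldowns: the two currents are independent and simply add."""
    return (linear_current(k1, v_in, 0.0, v_out, v_te)
            + linear_current(k2, v_in, 0.0, v_out, v_te))

if __name__ == "__main__":
    k1, k2, v_in, v_out, v_te = 2.0, 3.0, 1.0, 0.44, 0.14
    single_series = linear_current(k1 * k2 / (k1 + k2), v_in, 0.0, v_out, v_te)
    print(nand_current(k1, k2, v_in, v_out, v_te), single_series)
```

The bisection relies on the bottom current increasing and the top current decreasing as $v_m$ rises, which holds while both devices stay in the linear region.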


3.5.  Propagation  delay:  source-followers  and  pass  transistors 

The analysis which follows is not very rigorous; its purpose is to show that the RSIM models for logic gates overestimate the propagation delay through a circuit containing pass transistors and source-followers. Although better estimates would be desirable, the existing models are sufficient given the relatively constrained use of these components in actual circuits.

A source-follower (so called because the voltage of the source node "follows" the voltage of the gate node) is an n-channel device with its drain connected to VDD.

(a) source-follower circuit    (b) approximation for $i_{load}$

Figure 3.17. Source-follower circuit configuration

In the circuit shown in figure 3.17(a), the output voltage of the source-follower cannot rise higher than a threshold drop below the voltage of its input. Thus, the maximum voltage for the output of a source-follower is $1 - v_{te}$; this is why a depletion pullup (which can drive its output to VDD) is preferred in an ordinary logic gate.

Since a source-follower can only pull a node up, only the propagation delay associated with the low-to-high output transition needs to be analyzed. (A rising output transition corresponds to a rising input transition; unlike most logic circuits, a source-follower does not invert the sense of its input.) During a very slow input transition, the output voltage tracks the input voltage, and the propagation delay is equal to the time needed for the input to rise from $v_{thresh}$ to $v_{thresh} + v_{te}$. For a ramp input, this implies $t_{plh} = (v_{te})\delta = (0.14)\delta$ where $\delta$ is the time needed for the input to rise from 0 to VDD.

For a fast input transition (one where the input reaches 1 before the output reaches $v_{thresh}$), the current through the source-follower can be approximated as shown in figure 3.17(b). $t_{on}$ is the time at which $v_{in} = v_{te}$ and $t_1$ is the time at which $v_{in} = 1$. $i_{max}$ is estimated by the average current flowing through the source-follower during the transition:


Ktf  n 

'max  —  — -r~  ( 1  —  vte 


")v  thresh 


One can calculate $t_{plh}$ using an approach similar to that of section 3.4.2; the result is

$$t_{plh} = R_{sf} C_{load} + \tfrac{1}{2}(t_1 + t_{on}) - t_{input} \tag{3.34}$$

where $R_{sf} = \dfrac{v_{thresh}}{i_{max}}$. If the input is assumed to be a voltage ramp with transit time $\delta$, the final equation for $t_{plh}$ is

$$t_{plh} = \begin{cases} R_{sf} C_{load} + (0.35)\delta & \text{(small } \delta\text{)} \\ (0.14)\delta & \text{(large } \delta\text{)} \end{cases} \tag{3.35}$$

A source-follower is usually used to drive a large output load, so when $\delta$ is small, the RC term dominates. This suggests that the two pieces of the equation can be reconciled as

$$t_{plh} = R_{sf} C_{load} + (0.14)\delta \tag{3.36}$$

This equation is very similar to 3.23, which describes $t_{plh}$ for an ordinary logic gate, so no special handling is needed for a source-follower.

In the analysis of section 3.4 and the first part of this section, each examined device had essentially two terminals, since one terminal of each device connected to VDD or GND. Moreover, input signals were applied to the gate node of the device. The analysis now turns to circuits that contain three-terminal components, i.e., pass transistors. A pass transistor is any transistor not configured as a pulldown, pullup, or source-follower; some examples of circuits containing pass transistors are presented in section 3.2.

There are two basic configurations for a pass transistor: one with the gate node as input, and the source and drain as outputs; the other with the source/drain as input, and the drain/source as output (assuming that the gate is at logic high†). As the following table shows, when the gate of a pass transistor is the input, the pass transistor behaves like one of the components analyzed earlier.


input (gate)   source or drain   pass device acts as              analyzed in
falls          rises             pulldown turning off             section 3.4.2
falls          falls             enhancement pullup turning off   -
rises          falls             pulldown turning on              section 3.4.3
rises          rises             source-follower                  beginning of this section

The second pass transistor configuration presents a new analysis problem. Assume that the drain connection is the input (which remains constant) and that the source node undergoes a transition. If the drain undergoes a step transition from high to low at time 0, and the source follows, [Horowitz83] suggests the best estimate for the voltage of the source is

$$v_{source}(t) = 1 - \tanh\left(\frac{t}{R_{pass} C_{load}}\right) \tag{3.37}$$

This equation can be rearranged to give the propagation delay:

$$t_{phl} = R_{pass} C_{load} \tanh^{-1}(1 - v_{thresh}) = (0.63)\, R_{pass} C_{load} \tag{3.38}$$

Similarly, Horowitz suggests the best estimate for the voltage at the source, given a rising step at the drain, is

$$v_{source}(t) = 1 - \frac{1}{\dfrac{t}{R_{pass} C_{load}} + 1} \tag{3.39}$$

which gives

$$t_{plh} = R_{pass} C_{load}\, \frac{v_{thresh}}{1 - v_{thresh}} = (0.79)\, R_{pass} C_{load} \tag{3.40}$$

In both cases, the RC time constant of the RSIM model overestimates the propagation delay of a step input. For a slow input transition, the source voltage tracks the drain voltage, resulting in essentially zero propagation delay. (In this respect, the delay through a pass transistor is similar to the delay through a logic gate.) Although no direct evidence is provided here, the circumstantial evidence indicates that the predictions for propagation delay through a logic gate are upper bounds for the propagation delay through a pass transistor, regardless of the speed of the input transition.

† Although the analysis focuses on n-channel pass transistors, it can be extended to p-channel pass transistors in a straightforward manner.
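The numerical factors in equations 3.38 and 3.40 follow from solving each waveform for the time at which the source voltage crosses the threshold. A minimal sketch, assuming the normalized threshold $v_{thresh} = 0.44$ used elsewhere in this chapter (the function names are illustrative only):

```python
import math

V_THRESH = 0.44  # normalized switching threshold assumed in this chapter

def falling_delay_factor(v_thresh=V_THRESH):
    """Solve 1 - tanh(x) = v_thresh for x = t / (R_pass * C_load)."""
    return math.atanh(1.0 - v_thresh)

def rising_delay_factor(v_thresh=V_THRESH):
    """Solve 1 - 1 / (x + 1) = v_thresh for x = t / (R_pass * C_load)."""
    return v_thresh / (1.0 - v_thresh)

if __name__ == "__main__":
    print(round(falling_delay_factor(), 2), round(rising_delay_factor(), 2))
```

Both factors are below one, which is why a plain RC time constant overestimates the step-input delay in either direction.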

Pass  transistors  are  often  used  in  series  within  a  switching-logic  implementation  of  multiplexors, 
etc. 


Figure 3.18. Pass transistors connected in series


Horowitz extends his estimates for the voltage of a particular node $e$ to a chain of pass transistors by replacing the RC terms in equations 3.38 and 3.40 with

$$\tau_e = \sum_k R_{ke} C_k \tag{3.41}$$

where $R_{ke}$ is the resistance of the path common to node $e$ and node $k$. Thus, his estimate for the delay associated with a falling transition on node D of figure 3.18 is

$$t_{phl} = (0.63)\left[R_1 C_1 + (R_1 + R_2) C_2 + (R_1 + R_2 + R_3) C_3 + (R_1 + R_2 + R_3 + R_4) C_4\right] \tag{3.42}$$

If all the resistances are equal, and all the capacitances are equal, $t_{phl} = 6.3RC$. The RSIM estimate for the same transition is

$$t_{phl} = \Big(\sum_k R_k\Big)\Big(\sum_k C_k\Big) = 16RC \tag{3.43}$$

which overestimates the delay by a considerable margin. For a long chain of pass transistors, the RSIM estimate is very pessimistic; fortunately, performance constraints limit designers to chains of length four or less. Nevertheless, performance prediction for a circuit containing pass transistors is clearly an area where RSIM can be improved.†
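The gap between the two estimates for a chain can be reproduced with a few lines of arithmetic. The sketch below assumes the node of interest is the last node of the chain, so that the path resistance shared with node $k$ is the cumulative resistance up to $k$; the function names are illustrative, not RSIM routines.

```python
def horowitz_chain_delay(resistances, capacitances, factor=0.63):
    """Falling-transition estimate for the last node of a series chain.

    Each capacitance is weighted by the resistance between the input and
    that node, and the sum is scaled by the factor from equation 3.38.
    """
    path_resistance = 0.0
    total = 0.0
    for r, c in zip(resistances, capacitances):
        path_resistance += r
        total += path_resistance * c
    return factor * total

def lumped_rc_delay(resistances, capacitances):
    """Lumped estimate: total series resistance times total capacitance."""
    return sum(resistances) * sum(capacitances)

if __name__ == "__main__":
    r = [1.0] * 4
    c = [1.0] * 4
    print(horowitz_chain_delay(r, c), lumped_rc_delay(r, c))
```

For $n$ equal stages the weighted sum grows as $n(n+1)/2$ while the lumped product grows as $n^2$, so the ratio between the two estimates widens as the chain gets longer.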

3.6.  Implications  for  the  RSIM  model 

The analysis of the propagation delay of logic gates indicated that an RC time constant is a very good estimate for the delay of a gate when the input waveform is a voltage step. The analysis concludes that a simple RC time constant underestimates the actual propagation delay if the input waveform is assumed to be a voltage ramp with a rise/fall time of $\delta$. More accurate estimates for the propagation delays are

$$t_{plh} \le R_{pu} C_{load} + \Delta_{in:fall} \qquad t_{phl} \le R_{pd} C_{load} + \Delta_{in:rise} \tag{3.44}$$

where

$$\Delta_{in:fall} = \tfrac{1}{2}(v_{thresh} - v_{te})\delta \approx (0.15)\delta \qquad \Delta_{in:rise} = \tfrac{1}{2}(1 - v_{thresh})\delta \approx (0.28)\delta \tag{3.45}$$

are offsets that depend only on parameters of the input waveform. Section 3.5 shows that these equations are satisfactory upper bounds on the propagation delay through other (non-gate) circuit configurations.

† It is a straightforward modification of RSIM to make it use equations 3.38 and 3.40 instead of the lumped RC formula. However, these equations only apply to circuits containing a single driver; until the theory is extended to include multiple-driver configurations, it seems safest to use the conservative lumped RC approximation.


The computation of the propagation delay would be easier if it involved only the $\tau$ of the output node. A rearrangement of the time accounting accomplishes this:

(1) Report the time of the output transition as happening at $\tau$ time units after the input transition.

(2) Schedule the event associated with the output transition for $\tau + \Delta$ time units after the input transition where $\Delta = (0.28)(\text{total rise time})$ for rising transitions, and $\Delta = (0.15)(\text{total fall time})$ for falling transitions.

In other words, the effects of the input rise/fall time are factored in when the input transition is scheduled, so the $\Delta$ terms in equation 3.44 can be omitted when computing subsequent $\tau$'s. This rearrangement is illustrated in the following figure.

(a) according to equation 3.44    (b) proposed rearrangement

Figure 3.19. Rearrangement of time accounting for transitions

The total rise and fall times of a transition are related to the RC time constant of the transition. When the input is modeled as a ramp, the total rise/fall time is $(2.3)\tau$ since $\tau$ is measured using $v_{thresh} = 0.44$. As a result, the transitions of a given node can be handled in the following way:

(1) Compute the RC time constant ($\tau$) for the node.

(2) Report the time of the transition as $\tau$ time units in the future.

(3) Schedule the associated event at
    $(1.6)\tau$ time units in the future for a rising transition, or
    $(1.3)\tau$ time units in the future for a falling transition.

Note that $1.6 = 1 + (2.3)(0.28)$ and $1.3 = 1 + (2.3)(0.15)$. This scenario assumes that all consequences of a rising transition involve a falling transition, and vice versa. This is not always the case for a source-follower or a pass transistor, but the error involved (the difference between 1.6 and 1.3) is not large enough to be significant. The old scheme (accounting for the input transition time during each delay computation) can be used if desired.
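The three-step rule can be sketched as a small routine that walks a chain of transitions, each triggered by the event scheduled for the previous one. This is an illustration under stated assumptions: the function name, the tuple layout, and the sample time constants are hypothetical; only the factors 1.6 and 1.3 come from the text.

```python
RISE_FACTOR = 1.6  # 1 + (2.3)(0.28), rounded as in the text
FALL_FACTOR = 1.3  # 1 + (2.3)(0.15), rounded as in the text

def schedule_chain(transitions):
    """Return (reported_time, scheduled_time) for each transition in a chain.

    `transitions` is a list of (tau, direction) pairs with direction either
    "rise" or "fall". Each transition starts when the previous event fires.
    """
    now = 0.0
    rows = []
    for tau, direction in transitions:
        reported = now + tau                      # step (2): time shown to the user
        factor = RISE_FACTOR if direction == "rise" else FALL_FACTOR
        scheduled = now + factor * tau            # step (3): time the event fires
        rows.append((reported, scheduled))
        now = scheduled
    return rows

if __name__ == "__main__":
    for reported, scheduled in schedule_chain([(1.0, "rise"), (2.0, "fall")]):
        print(f"reported {reported:.2f}, scheduled {scheduled:.2f}")
```

Because the extra delay is folded into the scheduled time of the driving event, each later delay computation needs only the $\tau$ of its own node.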

Now that the model incorporates some information about the input waveform, it is interesting to review the examples presented in section 2.4. First the PLA calculations:

         node   transition   τ      RSIM predicts   SPICE predicts   RSIM schedules
                                    transition      transition       event
Case 1   A      ↓            0.3    0.3             0.8              0.4
         B      ↑            3.7    4.1             3.5              6.3
         C      ↓            0.9    7.2             6.8              7.5
         D      ↑            9.1    16.6            15.5             22.1
         E      ↓            0.9    23.0            20.7             -

Case 2   A      ↑            1.6    1.6             0.6              2.6
         B      ↓            0.8    3.4             1.9              3.6
         C      ↑            0.6    4.2             3.3              4.6
         D      ↓            1.3    5.9             6.4              6.3
         E      ↑            6.0    12.3            12.1             -

As one can see, RSIM's estimates are now better, and they overestimate transition times with reasonable consistency. (One expects overestimates because of the inequality in equation 3.44.) The estimate for Case 1 is 11% greater than the SPICE prediction; for Case 2, 2% greater. The story is similar for the OM2 data path example:

node   transition   τ      RSIM predicts   SPICE predicts   RSIM schedules
                           transition      transition       event
A      ↓            2.4    2.4             2.6              3.1
B      ↑            8.2    11.3            9.1              16.2
C      ↑            2.7    18.9            19.6             23.2
D      ↓            22.6   45.8            39.6             -

RSIM's prediction is 15% greater than that of SPICE. Note that the event for node B is scheduled using the rule for a rising transition (formulated assuming that any consequent transitions will be falling) even though node C is also undergoing a rising transition. This accounts for much of the overestimate by RSIM.

In conclusion, this chapter shows justification for the linear transistor model, especially if all waveforms can be modeled as steps. Of course, transitions are not steps in actual circuit operation; this fact motivated changes to the linear model, still allowing it to provide acceptable predictions of circuit behavior.

CHAPTER FOUR


Simulation Using a Linear Network Model


This chapter focuses on various RSIM implementation issues. The first section presents a detailed description of the simulation algorithm, with step-by-step accounts of the charge-sharing and final-value computations. Several techniques for speeding up the computations are described in the second section. The third section outlines some mechanisms available to the user for forcing the value and timing predictions for given nodes. The chapter concludes with an evaluation of the strengths and weaknesses of RSIM.


4.1.  The  RSIM  simulation  algorithm 

RSIM  uses  the  following  simple  recipe  for  simulating  a  circuit: 

(i) Accept new input values from the user. Perform the new-value computation (figure 4.2) for each new input value; this propagates the new value to nodes connected to the input by the source/drain connection of a transistor switch (see figure 2.14(a)). In addition, schedule the appropriate event so that any transistors affected by the new input value will be processed.

(ii) Process events from the event list, stopping (1) when the event list is empty, (2) when a node the user is tracing changes value, or (3) when the specified amount of simulated time has elapsed.

(iii) Loop back to (i) to accept new inputs.


The main loop of the simulator (step (ii) above) is described in the following figure. The node associated with each event is assigned its new value, and all stages affected by the new value are located and processed. (An affected stage is one that contains a source/drain node, called a seed node, of a transistor which has the event node as its gate.) The processing of a stage has two steps: first a charge-sharing computation for the stage, then a calculation of the final value of each node in the stage. Before each of the two steps, the COMPUTE flag of each seed node is set to indicate that the stage containing the seed node needs processing. A stage is processed only if its seed node has the COMPUTE flag set; as part of the processing, COMPUTE flags for nodes in the current stage are reset. This mechanism ensures that a stage is processed only once, even if it contains more than one seed node.

    while event list not empty {
        n := node associated with first event on event list
        remove first event from event list
        set n's value to the value specified by the event

        /* do charge-sharing computation for each affected stage [see section 4.1.1] */
        for each transistor with n as gate node, set COMPUTE flag for source
        for each transistor t with n as gate node
            if t has just turned on and COMPUTE still set for source node
                do charge-sharing computation for source

        /* do new-value computation for each affected stage [see figure 4.2] */
        for each transistor with n as gate node, set COMPUTE flag for source and drain
        for each transistor with n as gate node {
            if COMPUTE still set for source, do new-value computation for stage containing source
            if COMPUTE still set for drain, do new-value computation for stage containing drain
        }
    }

Figure 4.1. Main loop of RSIM algorithm

Note that the charge-sharing computation deals only with the source stage of each transistor, but the final-value computation deals with both the source and drain stages. This is because the charge-sharing calculation only deals with transistors known to be on; therefore, the source and drain belong to the same stage, and a stage computation involving the source automatically involves the drain.
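The role of the COMPUTE flag in the new-value half of the main loop can be sketched as follows. This is a simplified illustration, not RSIM's code: the `Node` and `Transistor` classes, the `run_events` function, and the `process_stage` callback are hypothetical names, the charge-sharing half of the loop is omitted, and the callback is assumed to clear the flag on every node of the stage it processes.

```python
from collections import deque

class Node:
    def __init__(self, name):
        self.name = name
        self.value = "X"
        self.compute = False   # the COMPUTE flag
        self.gated = []        # transistors that have this node as gate

class Transistor:
    def __init__(self, gate, source, drain):
        self.gate, self.source, self.drain = gate, source, drain
        gate.gated.append(self)

def run_events(events, process_stage):
    """Process (node, value) events; each affected stage is handled once."""
    while events:
        node, value = events.popleft()
        node.value = value
        seeds = [s for t in node.gated for s in (t.source, t.drain)]
        for seed in seeds:
            seed.compute = True            # mark every seed node first
        for seed in seeds:
            if seed.compute:               # still set: stage not yet processed
                process_stage(seed)

if __name__ == "__main__":
    g, s, d1, d2 = Node("g"), Node("s"), Node("d1"), Node("d2")
    Transistor(g, s, d1)
    Transistor(g, s, d2)

    def process_stage(seed):
        print("processing stage seeded by", seed.name)
        for n in (s, d1, d2):              # pretend all three form one stage
            n.compute = False

    run_events(deque([(g, 1)]), process_stage)
```

Setting all flags before processing any stage is what lets a stage with several seed nodes be visited through whichever seed comes first and skipped for the rest.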

The procedure for calculating the final value for each node in a stage is outlined in the following figure.

    initialize connection list to have starting node as only element
    set pointer to beginning of connection list
    if starting node is an input, input found := true, else input found := false

    /* find all nodes in current stage */
    while pointer not at end of connection list {
        n := node currently pointed at
        for each "on" transistor with source connected to n {
            if drain is an input, input found := true
            else if drain not on connection list, add drain to end of list
        }
        advance pointer to next list element
    }

    /* compute new final value for each node in stage */
    if no inputs found, all done (charge-sharing has computed the correct value)
    else for each node on connection list {
        if node is an input, do nothing (its value is set by user)
        else compute final value for node [section 4.1.2]
        reset VISITED flag (set by final-value computation) for each node on connection list
        reset node's COMPUTE flag
    }

Figure 4.2. Subroutine to compute new final value for every node in stage
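The first half of the subroutine in figure 4.2, the list walk that collects the nodes of a stage, can be sketched as follows. The representation is an assumption made for the example: `on_neighbors` maps each node name to the nodes reachable through a conducting transistor, and `inputs` is the set of input node names; neither name comes from RSIM.

```python
def find_stage(start, on_neighbors, inputs):
    """Collect the nodes of the stage containing `start`.

    Returns (connection_list, input_found). The list doubles as the work
    queue: nodes are appended as they are discovered and a pointer advances
    over them, so no recursion is needed.
    """
    connection_list = [start]
    on_list = {start}                      # fast membership test for the list
    input_found = start in inputs
    pointer = 0
    while pointer < len(connection_list):
        n = connection_list[pointer]
        for other in on_neighbors.get(n, ()):
            if other in inputs:
                input_found = True         # inputs bound the stage; do not expand them
            elif other not in on_list:
                on_list.add(other)
                connection_list.append(other)
        pointer += 1
    return connection_list, input_found

if __name__ == "__main__":
    net = {"a": ["b", "vdd"], "b": ["a", "c"], "c": ["b"]}
    print(find_stage("a", net, {"vdd"}))
```

When `input_found` is false the stage is isolated from every input, which is the case where the charge-sharing value is already the final value.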


The details of the charge-sharing and final-value computations are presented in the next two subsections, followed by a description of event management in RSIM.


4.1.1.  Charge-sharing  computation 

When a transistor turns on, its source and drain nodes become part of the same stage. As explained in section 2.2, if the voltages of all the nodes in a stage are not already identical, they become so through charge sharing. In order to calculate the charge-sharing value for each node, RSIM computes three summary capacitances from the capacitances of each node in the stage:

$C_{high}$   total capacitance of nodes with current state of logic high.

$C_{low}$    total capacitance of nodes with current state of logic low.

$C_X$        total capacitance of nodes with current state of X.

The summary capacitances are used to compute the charge-sharing value for the stage, as specified by equations 2.3 and 2.4:


$$\text{charge-sharing value} = \begin{cases} 0 & \dfrac{C_{high} + C_X}{C_{low} + C_{high} + C_X} \le v_{low} \\ 1 & \dfrac{C_{high}}{C_{low} + C_{high} + C_X} \ge v_{high} \\ X & \text{otherwise} \end{cases} \tag{4.1}$$

An event is scheduled for each node, specifying an immediate transition to the charge-sharing value. (See section 4.1.3 to find out what happens to new events.)

The charge-sharing computation is outlined in the following figure. The procedure performs a tree walk of a stage starting with a node passed as an argument from the new-value procedure. Since the nodes in the stage do not require processing in a particular order, the procedure is implemented without recursion.


    initialize list to have starting node as only element
    set pointer to beginning of list
    reset capacitance accumulators

    /* visit all nodes in stage, compute summary capacitances */
    while pointer not at end of list {
        n := node currently pointed at
        add capacitance of n to appropriate accumulator
        for each "on" transistor t with source connected to n {
            if drain is an input or static(t) > maxres, do nothing
            else if drain not on list, add drain to end of list
        }
        advance pointer to next list element
    }

    /* make each node in stage have charge-sharing value */
    compute charge-sharing value using equation 4.1
    for each node on list {
        reset node's COMPUTE flag
        schedule immediate transition to charge-sharing value
    }

Figure 4.3. Non-recursive routine for charge-sharing computation


If the resistance of a transistor is large enough, its source and drain nodes might not share charge, or
at least not very quickly. The user can specify a maximum resistance parameter (maxres) that controls
the scope of the charge-sharing calculation; the traversal of nodes in a stage stops at transistors with a
resistance greater than maxres. The COMPUTE flag indicates to the main RSIM loop which stages have
been processed by the charge-sharing calculation; the main loop uses the flag to ensure that the
charge-sharing calculation is performed only once for each stage.
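
The list-based walk of figure 4.3 and the threshold test of equation 4.1 can be sketched in Python as follows. This is an illustrative sketch only: the `Transistor` tuple, the dictionary-based node tables, the symmetric treatment of the two terminals, and the threshold values 0.3 and 0.8 are assumptions made for the example, not RSIM's actual data structures or defaults.

```python
from collections import namedtuple

# Hypothetical data model for the example; a and b are the two channel terminals.
Transistor = namedtuple("Transistor", "a b resistance on")

V_LOW, V_HIGH = 0.3, 0.8   # assumed normalized logic thresholds


def charge_share(start, transistors, cap, state, inputs, maxres):
    """Return (nodes in stage, charge-sharing value) for the stage holding start."""
    stage = [start]                 # the list doubles as the work queue
    seen = {start}
    acc = {"0": 0.0, "1": 0.0, "X": 0.0}   # C_low, C_high, C_X accumulators
    i = 0
    while i < len(stage):           # the list grows while it is being scanned
        n = stage[i]
        acc[state[n]] += cap[n]
        # A real simulator would keep a per-node transistor list; scanning all
        # transistors keeps the sketch short.
        for t in transistors:
            if not t.on or n not in (t.a, t.b):
                continue
            other = t.b if n == t.a else t.a
            if other in inputs or t.resistance > maxres:
                continue            # traversal stops at inputs and weak links
            if other not in seen:
                seen.add(other)
                stage.append(other)
        i += 1

    total = acc["0"] + acc["1"] + acc["X"]   # assumed > 0
    if (acc["1"] + acc["X"]) / total < V_LOW:
        value = "0"
    elif acc["1"] / total > V_HIGH:
        value = "1"
    else:
        value = "X"
    return stage, value
```

With a small high node sharing charge with a large low node, the stage settles low; raising the transistor's resistance above `maxres` leaves the starting node isolated and its value unchanged.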

Equation 4.1 leads to incorrect results when the surrounding network contains X transistors
(transistors with gates of X). A portion of the network that can be reached only through X transistors
might not be connected to the original node at all, and so should not make an active contribution to
the node's charge-sharing value. An alternative (suggested by Dave Gross) is the use of capacitance
intervals to accumulate the contribution of X connections. In this scheme, the capacitance
accumulators have interval values, e.g., C_high = [C_high.min, C_high.max]. The minimum value is the
total capacitance of nodes guaranteed to be connected to the current node; the maximum value also
includes the capacitance of nodes only reachable by X transistors. A separate charge-sharing
computation occurs for each node in the stage, as outlined in the following figure.

if node is input, C_high = C_low = C_X = [0,0]
else {
    local_C_high := local_C_low := local_C_X := [0,0]
    add node's capacitance to max and min of accumulator for node's value
    set VISITED flag for current node

    for each "on" transistor, t, with source connected to current node {
        if drain does not have VISITED flag set {
            recursively determine parameters for drain node
            if value of gate node for t is not X {
                local_C_high.min := local_C_high.min + C_high.min
                local_C_low.min := local_C_low.min + C_low.min
                local_C_X.min := local_C_X.min + C_X.min
            }
            local_C_high.max := local_C_high.max + C_high.max
            local_C_low.max := local_C_low.max + C_low.max
            local_C_X.max := local_C_X.max + C_X.max
        }
    }

    set C_high = local_C_high, and so on
}

Figure 4.4. Subroutine to compute capacitance intervals

The results determine the maximum and minimum node voltage, which determine the charge-sharing
value for the node:

\[
\text{charge-sharing value} =
\begin{cases}
0 & \text{if } \dfrac{C_{high}.max + C_X.max}{C_{low}.min + C_{high}.min + C_X.min} < V_{low} \\[2ex]
1 & \text{if } \dfrac{C_{high}.min}{C_{low}.max + C_{high}.max + C_X.max} > V_{high} \\[2ex]
X & \text{otherwise}
\end{cases}
\tag{4.2}
\]


Capacitances for nodes connected by X transistors contribute to the final value only in a negative
sense, i.e., they may cause a node to go to X, but never contribute to a value of 0 or 1. Leaving the
VISITED flag set as each new node is discovered ensures that each node is visited only once. After
completing the charge-sharing computation for a node, its COMPUTE flag is reset; the VISITED flags for
all nodes in the stage are also reset, in preparation for the next node's computation.

One disadvantage of the interval approach is that a separate calculation is performed for each
node in the stage, whereas the original scheme required only one calculation per stage. In addition,
the interval calculation must be performed by a recursive tree walk to ensure the correct handling of X
transistors. Fortunately, this computation can be merged with the tree walk described in the following
section, so the incremental cost is fairly small.
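
The interval scheme of figure 4.4 and equation 4.2 can be sketched in Python along the following lines. The adjacency-list representation, the `[min, max]` lists, and the threshold values are assumptions chosen for the illustration; a fresh `visited` set per call stands in for resetting the VISITED flags between nodes.

```python
V_LOW, V_HIGH = 0.3, 0.8   # assumed normalized logic thresholds


def _zero():
    # One [min, max] capacitance interval per logic state.
    return {"0": [0.0, 0.0], "1": [0.0, 0.0], "X": [0.0, 0.0]}


def cap_intervals(node, adj, cap, state, inputs, visited):
    """Capacitance intervals seen from node.

    adj[node] lists (neighbor, gate_value) for every transistor that is on
    or whose gate is X (hypothetical representation).
    """
    if node in inputs:
        return _zero()
    local = _zero()
    local[state[node]][0] += cap[node]
    local[state[node]][1] += cap[node]
    visited.add(node)
    for nbr, gate in adj[node]:
        if nbr in visited:
            continue
        sub = cap_intervals(nbr, adj, cap, state, inputs, visited)
        for s in "01X":
            if gate != "X":
                local[s][0] += sub[s][0]   # guaranteed connection
            local[s][1] += sub[s][1]       # possible connection
    return local


def interval_value(c):
    """Charge-sharing value from the intervals, following equation 4.2."""
    lo_total = c["0"][0] + c["1"][0] + c["X"][0]   # assumed > 0
    hi_total = c["0"][1] + c["1"][1] + c["X"][1]
    if (c["1"][1] + c["X"][1]) / lo_total < V_LOW:
        return "0"
    if c["1"][0] / hi_total > V_HIGH:
        return "1"
    return "X"
```

In this sketch a small high node behind an X transistor leaves a large low node at 0, while a comparably sized one forces it to X, which is the "negative" contribution described above.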

4.1.2.  Final-value  computation 

The final, driven value of a node is determined by the resistance of paths from the node to
various inputs. As we saw in chapter 2, a convenient way to characterize these paths is to calculate the
Thevenin equivalent for the portion of the network that can be reached from the node of interest.
Equation 2.6 relates the final value of a node to V_thev, the Thevenin equivalent voltage. The time
constant for a transition in the value of a node is also determined by the surrounding network; the
necessary parameters can be computed during the Thevenin calculation.

For computational convenience, RSIM actually computes RH and RL, the resistances of a resistor
divider that represents the effect of the surrounding network.

[diagram: node connected to VDD through RH, the net resistance of all paths to VDD, and to GND
through RL, the net resistance of all paths to GND]

Figure 4.5. Characteristic resistor divider for a node

RH and RL might be resistance intervals (RH = [RH_l, RH_h] and RL = [RL_l, RL_h]) if there are X
values in the surrounding network. The Thevenin equivalent voltage is easily calculated from the
characteristic divider:

\[
V_{thev} = [V_l, V_h] = \left[ \frac{RL_l}{RL_l + RH_h},\ \frac{RL_h}{RL_h + RH_l} \right]
\tag{4.3}
\]


For example, the lowest possible voltage is calculated using the least resistance to GND (specified by
RL_l) and the greatest resistance to VDD (specified by RH_h). Couching the computation in terms of
the characteristic resistance is advantageous for several reasons. Resistances to VDD and GND
represent, in a natural way, the connections made by MOS logic, as shown in chapter 3. With the aid
of some simple rules, it is easy to incrementally analyze any MOS network in terms of its component
resistances. Because resistances are directly related to the implementation, they can represent certain
circuit configurations, e.g., short circuits (RH = RL = 0), that cannot be simply characterized
using the Thevenin equivalent. The remainder of the section describes a tree walk algorithm to
compute the parameters needed for determining a node's value and for scheduling the appropriate
transition.
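
As a small illustration of equation 4.3, the following Python sketch computes the normalized Thevenin voltage interval from the two resistance intervals. The function names and the handling of the degenerate cases (returning `None` when no voltage is defined) are assumptions of this sketch rather than anything prescribed by the text.

```python
import math

INF = math.inf


def divider_voltage(rl, rh):
    """Normalized divider voltage RL / (RL + RH), with VDD taken as 1."""
    if math.isinf(rl) and math.isinf(rh):
        return None        # no path to any input: nothing drives the node
    if math.isinf(rl):
        return 1.0         # only a path to VDD
    if math.isinf(rh):
        return 0.0         # only a path to GND
    if rl + rh == 0:
        return None        # short circuit (RH = RL = 0): voltage undefined
    return rl / (rl + rh)


def thevenin_interval(RH, RL):
    """Return (V_l, V_h) for RH = (RH_l, RH_h) and RL = (RL_l, RL_h)."""
    rh_l, rh_h = RH
    rl_l, rl_h = RL
    v_l = divider_voltage(rl_l, rh_h)   # least RL against greatest RH
    v_h = divider_voltage(rl_h, rh_l)   # greatest RL against least RH
    return v_l, v_h
```

For instance, a 40 kilohm pullup against a 10 kilohm pulldown gives a degenerate interval at 0.2, while an uncertain pullup of [10 kilohm, infinity] widens the interval downward to 0.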

The computation of RH and RL proceeds by tracing paths to the inputs that are reachable from
the node of interest, and then calculating the resistance of each path, starting at the input and working
back toward the original node. Two rules are helpful for calculating path resistance. The first rule
specifies the apparent path resistances when a divider exists on the other side of a resistor:


Figure  4.6.  Reduction  rule  for  resistor  divider  with  series  resistor 


The parameters for the apparent resistances (T and B in figure 4.6(b)) cannot be determined exactly;
an approximation is therefore necessary. Appendix 3 explains why this is so, and derives the following
formulas for the approximation:

\[
\begin{aligned}
T_l &= P_l + R_l + R_l \frac{P_l}{Q_h} & \qquad T_h &= P_h + R_h + R_h \frac{P_h}{Q_l} \\[1ex]
B_l &= Q_l + R_l + R_l \frac{Q_l}{P_h} & \qquad B_h &= Q_h + R_h + R_h \frac{Q_h}{P_l}
\end{aligned}
\tag{4.4}
\]


The second rule is much simpler; it indicates how to merge the resistances of two separate paths to
obtain the net resistance for both paths:

(a) dividers for two parallel paths

Figure 4.7. Reduction rule for combining two parallel paths
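
The two reduction rules can be sketched in Python for the simple case where every resistance is a single number rather than an interval. The series rule below uses the form T = P + R + R·P/Q and B = Q + R + R·Q/P, which preserves both the divider ratio and the net resistance; the second rule is assumed to be the ordinary parallel combination of the corresponding legs, matching the `||` operator used in the pseudocode. Function names and the treatment of infinite and zero resistances are assumptions of this sketch.

```python
import math

INF = math.inf


def parallel(a, b):
    """Parallel combination of two resistances, allowing 0 and infinity."""
    if a == 0 or b == 0:
        return 0.0
    if math.isinf(a):
        return b
    if math.isinf(b):
        return a
    return a * b / (a + b)


def series_rule(p, q, r):
    """First rule: divider (p to VDD, q to GND) seen through series resistor r.

    Returns (T, B), the apparent resistances to VDD and GND.
    """
    if math.isinf(p) and math.isinf(q):
        return INF, INF            # a dead end stays a dead end
    if math.isinf(p):
        return INF, q + r          # only a path to GND lies behind r
    if math.isinf(q):
        return p + r, INF          # only a path to VDD lies behind r
    if p == 0 and q == 0:
        raise ValueError("short circuit behind r: divider ratio is undefined")
    if p == 0:
        return r, INF
    if q == 0:
        return INF, r
    return p + r + r * p / q, q + r + r * q / p


def parallel_rule(d1, d2):
    """Second rule: merge two (to-VDD, to-GND) dividers leg by leg."""
    return parallel(d1[0], d2[0]), parallel(d1[1], d2[1])
```

As a check on the series rule, a divider of 30 kilohm over 10 kilohm behind a 5 kilohm transistor yields T = 50 kilohm and B of roughly 16.7 kilohm, so the voltage ratio B / (T + B) remains 0.25.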


To compute the Thevenin equivalent for a particular node, one starts by locating all conducting
transistors connected to that node and then recursively analyzing the network on the other side of
each of the transistors. Each node is marked as its analysis begins; recursive calls ignore portions of
the network involving marked nodes. This keeps the analysis expanding outward, eventually
terminating at a dead-end (no paths leading to unmarked nodes) or an input. These particular circuits
are easy to analyze, as shown in the following figure.


(a)  low  input  (GND)  (b)  high  input  (VDD)  (c)  dead-end 

Figure  4.8.  Characteristic  dividers  for  input  nodes  and  dead-ends 


The resistances of paths leading from a particular node are combined using the two reduction rules
above. Using the first rule, the results of a recursive call (shown as P and Q in figure 4.6) are
combined with the resistance of the conducting transistor leading to that piece of the network (shown
as R), to yield the net resistance of the path. This resistance is combined with the resistances from
other recursive calls using the second reduction rule. When all paths have been accounted for, the
analysis for the node is complete. The resulting divider is the desired answer, or is used as part of the
analysis of some other node if the analysis was performed because of a recursive call. The process is
diagramed in the following figure.

(c) after applying first reduction rule (d) after applying second reduction rule

Figure 4.9. Network analysis by repeated rule application

The  complete  analysis  procedure  is  outlined  in  the  next  figure.  The  results  are  stored  in  eight 
global  variables: 

RH      resistance interval for net resistance of all paths to VDD. Path resistance
        computed using static resistance of each transistor.

RL      resistance interval for net resistance of all paths to GND. Path resistance
        computed using static resistance of each transistor.

R_vdd   net resistance to VDD, computed using the dynamic-high resistance of each
        transistor. Simple series/parallel calculation; paths containing X transistors
        are ignored.

R_gnd   net resistance to GND, computed using the dynamic-low resistance of each
        transistor. Simple series/parallel calculation; paths containing X transistors
        are ignored.

R_X     net resistance to all inputs, computed using the dynamic-high resistance to
        high inputs, and dynamic-low resistance to low inputs. Simple series/parallel
        calculation; includes paths containing X transistors.

C_high  total capacitance of nodes with current state of logic high.

C_low   total capacitance of nodes with current state of logic low.

C_X     total capacitance of nodes with current state of X.

If the interval charge-sharing calculation is merged with this calculation, the upper limit of the
capacitance intervals in the charge-sharing calculation can be used in place of the three capacitance
accumulators just defined. The procedure also uses four stack-allocated local variables to accumulate
the first four quantities listed above, during the calculation for each node.

if node is logic low input {
    return with RH = R_vdd = ∞ and RL = R_gnd = R_X = 0
} else if node is logic high input {
    return with RH = R_vdd = R_X = 0 and RL = R_gnd = ∞
} else {
    local_R_vdd := local_R_gnd := local_R_X := local_RH := local_RL := ∞
    add node capacitance to appropriate accumulator
    set VISITED flag for current node

    for each "on" transistor, t, with source connected to current node {
        if drain does not have VISITED flag set {
            recursively determine parameters for drain node
            combine static(t) with RH and RL using first reduction rule
            combine result with local_RH and local_RL using second reduction rule
            if value of gate node for t != X {
                local_R_vdd := local_R_vdd || (dynhigh(t) + R_vdd)
                local_R_gnd := local_R_gnd || (dynlow(t) + R_gnd)
            }
            local_R_X := local_R_X || (min(dynhigh(t), dynlow(t)) + R_X)
        }
    }

    set R_vdd = local_R_vdd, RH = local_RH, and so on
}

Figure 4.10. Subroutine to compute parameters of resistor divider

Marking each node as it is visited (by setting its VISITED flag) avoids cycles and keeps the tree walk
expanding outward from the starting node. If the network does contain cycles, the subroutine only
approximates the true resistance to VDD and GND. For example, consider the following logic gate
where the output (the pulled-up node) is the node of interest:

(a) circuit containing cycles (b) circuit as analyzed (c) circuit as analyzed if marks removed

Figure 4.11. Analysis of circuit containing cycles

Since the marks are not removed when the analysis of a path is completed, RSIM treats the cycle as if
the circuit were configured as shown in figure 4.11(b). This approximation results in an
overestimate of the actual resistances. If a node's mark were removed as the procedure exited, all
paths through the network would be explored (as shown in figure 4.11(c)); in this case, the resistance
would be underestimated, leading to optimistic performance predictions.

Cycles are relatively rare in nMOS designs; when they occur, the extra path is often redundant,
i.e., the circuit is designed to work correctly if any path in the cycle became the sole connection. This
means the approximation used by RSIM is usually not out of line with the designer's intentions. This
statement holds for CMOS as well, with one notable exception, the CMOS pass gate:


[pass-gate schematic; gate signals A and Abar]

Figure 4.12. A CMOS pass gate


In this circuit configuration, one device is sized to carry most of the load, and the other exists simply to
ensure no threshold drop across the gate. In analyzing such a circuit, RSIM arbitrarily chooses the
transistor that makes the connection; the other transistor's contribution is ignored. This is satisfactory
if the transistor with the smaller resistance is chosen, but such is not always the case. To correct the
problem, the transistor list for each node can be arranged in order of increasing resistance; this ensures
paths of least resistance are examined and marked first. Note that this solution only works when the
paths in a cycle have a length of one transistor (as in the pass gate above). If the paths are longer,
there is no guarantee that the path of least total resistance will happen to start with the transistor that
has the least resistance.

After the various parameters are calculated, the final value of a node can be calculated using
equations 2.6 and 4.1:

\[
\text{final value} =
\begin{cases}
0 & \text{if } V_h < V_{low} \text{ or } (\text{old value} = 0 \text{ and } RH_l = \infty) \\
1 & \text{if } V_l > V_{high} \text{ or } (\text{old value} = 1 \text{ and } RL_l = \infty) \\
X & \text{otherwise}
\end{cases}
\tag{4.5}
\]

The extra clause for "0" and "1" values prevents a node from being unnecessarily forced to X when it
has no connection to inputs of the opposite logic state. The appropriate event is scheduled R_eff C_eff
seconds in the future, where

\[
R_{eff} =
\begin{cases}
R_{gnd} & \text{final value} = 0 \\
R_{vdd} & \text{final value} = 1 \\
R_X & \text{final value} = X
\end{cases}
\tag{4.6}
\]

\[
C_{eff} =
\begin{cases}
C_{high} + C_X & \text{final value} = 0 \\
C_{low} + C_X & \text{final value} = 1 \\
C_{low} + C_{high} & \text{final value} = X
\end{cases}
\tag{4.7}
\]


The disposition of this event depends on the nature of any pending events and the node's current
value; see section 4.1.3 for the details of event management.
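
Equations 4.5 through 4.7 translate almost directly into code. The Python sketch below reads equation 4.5 as requiring the whole voltage interval [V_l, V_h] to lie below V_low for a 0 (or above V_high for a 1); the function names, argument order, and threshold values are assumptions of this example.

```python
import math

INF = math.inf
V_LOW, V_HIGH = 0.3, 0.8   # assumed normalized logic thresholds


def final_value(v_l, v_h, old, rh_l, rl_l):
    """Final logic value of a node from its Thevenin voltage interval."""
    if v_h < V_LOW or (old == "0" and math.isinf(rh_l)):
        return "0"
    if v_l > V_HIGH or (old == "1" and math.isinf(rl_l)):
        return "1"
    return "X"


def transition_delay(value, r_vdd, r_gnd, r_x, c_high, c_low, c_x):
    """Delay R_eff * C_eff (seconds) before the final-value event fires."""
    r_eff = {"0": r_gnd, "1": r_vdd, "X": r_x}[value]
    # Only capacitance that must change state contributes to the delay.
    c_eff = {
        "0": c_high + c_x,
        "1": c_low + c_x,
        "X": c_low + c_high,
    }[value]
    return r_eff * c_eff
```

For example, discharging 0.1 pF of high capacitance through a 10 kilohm path to GND gives a delay of about 1 ns.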

The user has some control over the final-value computation. The time constant for event
scheduling can be forced to 1, implementing a unit-delay simulation. This is useful when a node value
is to be calculated using transistor resistances, but transition timing is not important. Another option is
flagging those events corresponding to transitions to X, where the X value is specifically caused by a
ratio error (rather than other X's in the network). Such transitions are characterized by RH_h < ∞
and RL_h < ∞; if an X exists in the surrounding network, one or both of these parameters is infinite.
When a flagged event is processed, the transition is reported to the user as a ratio error. Because the
error report is delayed until the flagged event is processed, short-lived ratio errors (those caused by
small differences in propagation delays) are ignored, and the error reports reflect only significant ratio
errors. Of course, in some designs, even long-lived ratio errors might not affect correct circuit
operation, so the reporting is optional.

When RH_l = RL_l = ∞, the node is not connected to any inputs, and the charge-sharing
computation described in the previous section correctly computes the node's final value. Ordinarily,
the final-value calculation does not schedule any events in this case, but the user can optionally
request the scheduling of a charge-decay event. A charge-decay event sets the node value to X after a
specified interval which the user can set. At first glance, it might seem odd to schedule all decay
events using the same interval; a more suitable estimate might be based on factors such as the node's
capacitance, the number of transistors connected to the node, and so on. However, precise predictions
are not necessarily the most useful here. The actual decay time for MOS circuits is in the millisecond
range. Since it is unlikely that a simulation spans that long a period of simulated time, a precise
accounting of the decay time never results in a decay! A more useful approach is based on the
observation that a designer usually intends for all dynamic nodes to be refreshed every few clock
cycles. When the decay time is set to an interval slightly larger than the intended refresh period, the
unrefreshed nodes decay quickly, and the user receives a suitable error report. Thus, even a short
simulation run catches a decay problem. This type of debugging experiment can be much more
effective than a precise estimate in pinpointing a problem.

4.1.3. Event Management

Up to two events can be pending for a node:

(1) a charge-sharing (CS) event. CS events are always immediate events, i.e., they
are scheduled for the current simulated time.

(2) a final-value (FV) event, scheduled for sometime in the future.

Thus, up to two transitions are possible for a given node. Each event corresponds to a real transition,
i.e., the new value of a CS event always differs from the current value of the node, and the new value
of a FV event differs from that of the CS event (or the current node value if there is no pending CS
event). Since only two transitions can be pending at any moment, newly calculated events must be
merged with the pending events. Section 2.3 hinted at the issues involved; in general, RSIM makes its
choices based on the principle that the most recently calculated event best reflects the current network
configuration. Since no information is available that explains why any pending events were created,
there is little (if any) reason to save a previously-calculated event in preference to the newer one.

The following figure describes the simple merging rules used by RSIM:

if  merging  new  CS  event  { 

abort  pending  CS  and  FV  events 

if  new  charge-sharing  value  is  different  from  current  node  value 
schedule  new  CS  event 

} 

if  merging  new  FV  event  { 

if  new  value  differs  from  CS  value  (or,  if  no  CS  event  pending,  current  node  value) 
schedule  new  FV  event 

} 

Figure  4.13.  Merging  a  new  event  with  pending  events 
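
The merging rules of figure 4.13 can be sketched in Python with one slot per pending event kind. The `Node` class and its field names are hypothetical; in particular, storing the FV event in a single slot (so that a newly scheduled FV event replaces an older one) is an assumption of this sketch.

```python
class Node:
    """Hypothetical node record holding at most one CS and one FV event."""

    def __init__(self, value):
        self.value = value
        self.cs_event = None   # pending charge-sharing value, or None
        self.fv_event = None   # pending (time, value) final-value event, or None


def merge_cs(node, new_value):
    """Merge a newly computed charge-sharing event."""
    node.cs_event = None       # abort pending CS and FV events
    node.fv_event = None
    if new_value != node.value:
        node.cs_event = new_value


def merge_fv(node, time, new_value):
    """Merge a newly computed final-value event."""
    # The FV event must be a real transition relative to the CS value,
    # or to the current value when no CS event is pending.
    base = node.cs_event if node.cs_event is not None else node.value
    if new_value != base:
        node.fv_event = (time, new_value)
```

A typical sequence is a call to `merge_cs` followed by `merge_fv` for the same node, since a new final-value computation follows the charge-sharing computations.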

A new CS event aborts a pending FV event because a new final-value computation always occurs after
the charge-sharing computations are complete. Although this approach is simple, it occasionally leads
to pessimistic predictions. For example, if one input of a two-input NOR gate turns on substantially
before the other, the propagation delay is actually determined by the time of the first input's
transition. With the merging scheme outlined above, the two events scheduled at the time the second
input turns on cause other events to be aborted, namely those scheduled because of the first input's
transition. This occurs even if one of the aborted events is scheduled for an earlier time than the
second event. In other words, with the merging scheme above, the propagation delay of a NOR gate
might be incorrectly measured from the later input. There is no simple fix to the merging rules above
that solves this problem. The correct solution requires knowledge of both the new CS event and the
new FV event, so that pending events can be saved if they are compatible with both newer events. If
the charge-sharing and final-value calculations are merged, as suggested at the end of section 4.1.1, it
should be straightforward to implement the correct merging scheme.

There are several alternatives for dealing with aborted events. The simplest approach is to
handle the event as if it were never scheduled, i.e., do nothing. This is the approach RSIM adopts.
Another approach is motivated by the physical significance of an aborted event. Since the signal
changes between the transition start time (the time when the charge-sharing or final-value computation
was performed) and the transition end time (the scheduled time of the event), the action of aborting
the event corresponds to a stop in mid-transition. Aborted transitions are termed glitches
[Thompson74]; these malformed signals sometimes have significant impact on the operation of a circuit
and should be reported to the user. This report can be in the form of a forced transition to X, or just
a simple error message. Interestingly, a user who has the option to receive glitch reports almost always
disables that feature [Ulrich73]. The reason given is that the duration of an aborted transition is
usually short enough so that the actual signal does not change significantly; hence no glitch actually
occurs.†

Scheduling an event entails inserting it into the event list, placed according to its scheduled time.
An event list implemented as a simple list would impose a noticeable scheduling overhead. RSIM
adopts several techniques for reducing this overhead. It quantizes simulated time, and rounds off each
event time to the nearest time quantum; in the current implementation, the time quantum is 0.1
nanosecond. The event list is implemented in two pieces:

(1) an event array. Each array element is a doubly-linked list of events for a
particular time quantum.

(2) an overflow list, a doubly-linked list of events, sorted by event time.

†Some researchers propose showing transitions between logic states as 0-X-1 or 1-X-0, where the initial transition to
X happens immediately. Thus aborted events leave the node value at X until some subsequent event re-establishes a
legitimate logic state. This suggestion doubles the number of events in a simulation, a cost which might outweigh the
advantages.

This organization is similar to that found in many conventional gate-level simulators [Vaucher75,
Ulrich76]. The event lists are doubly-linked to allow quick removal of an aborted event from the list.
The data structures are diagramed in the following figure.


[diagram: event array whose elements head lists of events, and an overflow list of events]

Figure 4.14. The event list is implemented with an event array and overflow list

The event array is managed as a circular buffer in which the N array elements hold events for the next
N time quanta. An array index indicates which array element corresponds to the current simulated
time. If a new event is scheduled for a time M quanta in the future, where M < N, the event is added
to the end of the event list stored in array element (index + M) mod N; no sorting or searching is
required. If M ≥ N, the event is inserted into the overflow list according to its scheduled time. The
array size is chosen so that most events are scheduled directly into the array. With a time quantum of
0.1 nanoseconds, a 128- or 256-element array captures most events in modern MOS designs. Note that
events are added to the end of an event list. This ensures that events are processed in first-in, first-out
order, i.e., in the order created. Thus, cause-and-effect relationships are preserved.

To find the next event to process, the event array is searched starting at the current index, until
an event is found. Each increment of the index corresponds to advancing simulated time by one time
quantum. If the array is empty, simulated time is advanced to equal the scheduled time of the first
event on the overflow list; this event becomes the next one to be processed. When an event is located
for processing, the overflow list is examined to find events whose scheduled times are less than N time
quanta away from the new simulated time. Such events are moved from the overflow list to the
appropriate list in the event array. This preserves the first-in, first-out event ordering mentioned
above.
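
The event array and overflow list can be sketched in Python as below. This is a simplified illustration under several assumptions: `collections.deque` and a sorted Python list stand in for the doubly-linked lists, times are already integers in units of one quantum, and the class and method names are invented for the example. Removal of aborted events is omitted.

```python
import bisect
from collections import deque


class EventList:
    """Circular event array of n slots plus a sorted overflow list."""

    def __init__(self, n):
        self.n = n
        self.now = 0                          # current simulated time, in quanta
        self.wheel = [deque() for _ in range(n)]
        self.in_wheel = 0                     # number of events held in the array
        self.overflow = []                    # sorted (time, seq, event) tuples
        self.seq = 0                          # tie-breaker keeps overflow FIFO

    def schedule(self, delay, event):
        t = self.now + delay
        if delay < self.n:                    # fits in the array: no searching
            self.wheel[t % self.n].append((t, event))
            self.in_wheel += 1
        else:
            bisect.insort(self.overflow, (t, self.seq, event))
            self.seq += 1

    def _refill(self):
        # Move overflow events that now fall within n quanta into the array.
        while self.overflow and self.overflow[0][0] - self.now < self.n:
            t, _, event = self.overflow.pop(0)
            self.wheel[t % self.n].append((t, event))
            self.in_wheel += 1

    def next_event(self):
        """Return the next (time, event) pair, or None if nothing is pending."""
        if self.in_wheel == 0:
            if not self.overflow:
                return None
            self.now = self.overflow[0][0]    # jump to first overflow event
        else:
            while not self.wheel[self.now % self.n]:
                self.now += 1                 # advance time one quantum per slot
        self._refill()
        self.in_wheel -= 1
        return self.wheel[self.now % self.n].popleft()
```

Refilling from the overflow list before the located event is handed back means that anything the caller schedules while processing that event is appended behind the migrated events, which is how the sketch keeps the first-in, first-out ordering described above.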
4.2. Speeding up the simulation

No simulator is fast enough. Increased simulator performance is always in demand, either to
achieve faster turnaround during the design process, or to allow more complete testing during
verification. This section discusses several techniques for improving the performance of the algorithms
presented in the previous section.

It is not surprising to learn that, during event processing, most of the time is spent in the
final-value calculation.† To compute the final value for a given node, the final-value computation must visit
all the nodes in the current stage. Thus, if there are n nodes in the stage, processing the entire stage
takes O(n^2) time. Since the remainder of the processing is proportional to the size of the stage, the
real bottleneck is the final-value computation. Performance can be improved by

(1)  introducing  a  cache  for  final-value  computations,  with  the  intent  of  eliminating 
the  recalculation  of  parameters  for  subnetworks. 

(2)  reducing  the  number  of  nodes  in  the  stage. 

(3)  reducing  the  cost  of  each  calculation,  for  example,  by  substituting  integer 
arithmetic  for  floating-point.  This  alternative  will  not  be  discussed  further, 
except  to  note  that  a  32-bit  integer  has  over  9  orders  of  magnitude  of  dynamic 
range,  sufficient  for  representing  MOS  resistances. 

Clearly, the first improvement is most significant when n is large. The third improvement is important
when n is small and the dominant cost is the actual arithmetic. The second improvement works on
making (3) more important than (1). The improvements are discussed in turn below.

As  it  is  currently  formulated,  the  final-value  procedure  performs  many  redundant  computations. 

Consider the circuit diagram for a 5-node stage shown in (a) below, and one of its subcircuits, shown
in (b) below.


†The discussion in this section is limited to that portion of the simulator which propagates new values through the
network. RSIM has an interpreted LISP-like command language which the designer uses to prepare new input
values and process the results of a simulation step. Depending on the sophistication of the simulation environment
built by the user, a substantial portion of the total time can be spent in the command language interpreter. Of
course, there is room for improvement here too, but that is outside the scope of this thesis.
(a) 5-node stage (b) example subcircuit

Figure 4.15. Stage containing 5 nodes and 4 transistors


When one traces the computations performed by the final-value procedure (see figure 4.10), it
becomes apparent that the parameters for a specific subcircuit are calculated several times. The
computations for nodes A, B, and C all need the same information about the subcircuit in figure
4.15(b); there is no reason to compute the information more than once.

The amount of redundant computation can be reduced by caching the result from each call to
the final-value procedure.† Before each call, the cache is searched to see if the subcircuit was analyzed
previously; if so, the results are taken from the cache and not recomputed. If the cache has constant
access time, the cost of the final-value analysis for a stage is reduced to O(n), a significant saving
when n is large. In RSIM, the cache does not need to accommodate arbitrary amounts of information;
associating two cache entries with each transistor (one for the source, one for the drain) is sufficient.
The source cache retains the network parameters for the subnetwork connected to the drain node
(including the transistor), and the drain cache is similar. When the analysis of a subnetwork is
completed, the result is placed in the appropriate cache.
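
The caching scheme can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the stage is taken to be a tree of nodes joined by conducting transistors, the eight network parameters are replaced by a single number (the total capacitance of a subnetwork), and the names `Stage`, `subnet_cap`, and `analyze` are hypothetical rather than taken from RSIM.

```python
from collections import defaultdict


class Stage:
    """Tree-shaped stage: nodes joined by conducting transistors (illustrative only)."""

    def __init__(self, capacitance, transistors, use_cache=True):
        self.cap = dict(capacitance)          # node -> capacitance
        self.ends = dict(transistors)         # transistor -> (source, drain)
        self.adj = defaultdict(list)          # node -> transistors touching it
        for t, (s, d) in self.ends.items():
            self.adj[s].append(t)
            self.adj[d].append(t)
        self.use_cache = use_cache
        self.cache = {}                       # (transistor, seen_from) -> result
        self.evaluations = 0                  # subnet analyses actually performed

    def subnet_cap(self, t, seen_from):
        """Capacitance of the subnetwork reached through t, looking from seen_from."""
        key = (t, seen_from)
        if self.use_cache and key in self.cache:
            return self.cache[key]
        self.evaluations += 1
        s, d = self.ends[t]
        far = d if seen_from == s else s
        total = self.cap[far] + sum(
            self.subnet_cap(u, far) for u in self.adj[far] if u != t
        )
        if self.use_cache:
            self.cache[key] = total
        return total

    def analyze(self, node):
        """Stand-in for the final-value analysis of one node."""
        return self.cap[node] + sum(self.subnet_cap(t, node) for t in self.adj[node])
```

With two cache slots per transistor (one per viewing side), analyzing every node of a 5-node chain performs 8 subnet analyses instead of 20. The cached entries are only meaningful while the network state is unchanged, so a real implementation would have to discard them between simulation steps.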


(b) circuit after analysis of subnet #2


Figure  4.16.  Transistor  cache  scheme 


In the figure above, once subnet #1 has been analyzed and the result saved in the source cache,
subsequent analyses involving the same transistor and subnet use the cached result. The following
figure shows the cache status after calculation of the final value for node D of figure 4.15(a).

†This caching technique is known in the LISP community as memoization.


Figure 4.17. Cache status after final-value calculation for node D

Subsequent analysis of node C, for example, requires only a single recursive call (rather than four as
before).

There are several reasons why the transistor cache might not be the ideal solution. The amount
of information in each cache entry (8 parameters) is quite large compared to the transistor data
base.  This  suggests  that  cache  entries  should  be  dynamically  allocated  when  needed,  and  returned 
when  the  computation  is  complete.  The  combined  costs  of  storage  management  and  cache  access 
might  exceed  the  cost  savings  realized  on  stages  of  modest  size.  These  objections  can  be  addressed  by 
associating  cache  entries  with  nodes  instead,  or  using  the  cache  only  when  the  stage  exceeds  a 
specified  size. 

However  the  cache  is  organized,  its  introduction  has  a  substantial  impact  on  the  amount  of 
computation  required  for  the  final-value  analysis  of  a  stage.  Another  improvement  mentioned  at  the 
beginning  of  the  section  is  reducing  the  number  of  nodes  in  a  stage.  The  key  element  of  this  is  the 
notion of useless nodes, i.e., nodes that do not connect to any transistor gates and hence whose values
are irrelevant. Such nodes commonly occur in a pulldown path containing more than one transistor,
such  as  the  node  marked  by  an  asterisk  in  figure  4.18(a). 


(a) nMOS logic gate (b) pulldown after removing useless node

Figure 4.18. Removing useless nodes from a stage

Section 3.4.4 mentions that a pulldown with more than one transistor is electrically equivalent to a
single-transistor pulldown of the appropriate size. This suggests that such a pulldown can be replaced
by a circuit like the one shown in figure 4.18(b). All the nodes in the pulldown except the output and
GND are eliminated, and all the pulldown transistors are replaced by a single transistor. The gate value
of the single transistor is the logical conjunction of the values of the gates of the original pulldown
chain. In fact, RSIM uses a compact representation for the generalized MOS gate:


[Gate structure entries: first pulldown, second pulldown, third pulldown, pullup]

Figure 4.19. Efficient internal representation of an nMOS logic gate

All transistors and nodes that make up the gate are eliminated, and the resulting gate structure is
associated with the output node. The output can still connect to other transistors that are not
recognized as part of a logic gate; only those transistors that implement a MOS logic gate are
compressed. The resistance parameters of a gate structure are computed very efficiently by RSIM,
many times more quickly than the analysis of the equivalent network.
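
As a rough illustration of collapsing a series pulldown into one equivalent transistor, the Python sketch below combines the gate values with a three-valued conjunction and adds the on-resistances. The function names and the tuple layout are hypothetical, and treating the "appropriate size" as a simple sum of series resistances is an assumption made here, not a description of RSIM's internal gate structure.

```python
def and3(values):
    """Three-valued conjunction over the logic states '0', '1', and 'X'."""
    values = list(values)
    if "0" in values:
        return "0"      # one off transistor breaks the whole series path
    if "X" in values:
        return "X"      # no definite 0, but at least one unknown gate
    return "1"


def compress_pulldown(chain):
    """Collapse a series pulldown chain into one equivalent transistor.

    chain: list of (gate_value, on_resistance) pairs, one per series transistor.
    Returns (gate_value, resistance) for the single replacement transistor.
    """
    gate = and3(g for g, _ in chain)
    resistance = sum(r for _, r in chain)   # assumed: series resistances add
    return gate, resistance
```

For example, `compress_pulldown([("1", 10), ("1", 15)])` yields a conducting transistor of resistance 25, while any 0 among the gates turns the equivalent transistor off.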

The compression of gate circuits into the compact internal representation also results in a
considerable space saving. Somewhere between 40% and 80% of the transistors in most circuits are
eliminated when the gate structures are built. The resulting simulation runs roughly twice as fast as
the uncompressed network. This optimization is probably the single largest contributor to the ability
of RSIM to deal with very large MOS circuits.

4.3. Escape mechanisms

Previous sections of this chapter introduced mechanisms that allow the user to adjust the
operation of the simulator as a whole. There are occasions, however, when a large-scale adjustment is
inappropriate, and only the predictions for a single node need correction. This section discusses
several "escape" mechanisms provided by RSIM for adjusting the predictions for small groups of nodes
and transistors.

The modifications discussed here are ad hoc in nature; their motivation arises from purely
practical considerations. The mechanisms are not intended to allow wholesale changes in the
simulation computation, but are provided so the designer can correct particularly egregious or
far-reaching errors in the simulation of specific circuits. Since the mechanisms treat the symptoms and not
the disease, their effectiveness is limited to local improvements.

There are four user-adjustable parameters for each node:

vlow   the logic low threshold for the node (specified in normalized voltage units).

vhigh  the logic high threshold for the node (specified in normalized voltage units).

TPLH   the low-to-high transition time for the node (specified in time quanta).

TPHL   the high-to-low transition time for the node (specified in time quanta).

By adjusting the logic thresholds with vlow and vhigh, the user can prevent predictions of X values
for  circuits  with  non-standard  pullup/pulldown  ratios.  This  can  be  useful  in  a  circuit  where  a  node’s 
voltage  swing  is  reduced  for  performance  or  other  reasons  (for  example,  in  input  buffers  or  bit-lines  of 
dynamic  memory  circuits). 

The  transition  time  parameters  force  the  timing  of  all  the  node's  transitions.  These  parameters 
allow  adjustment  of  the  timing  of  critical  nodes  to  agree  with  predictions  of  circuit  analysis  programs. 
Clocks, for example, often are generated by special circuitry designed to drive a capacitive load.
Intricate timing chains involving bootstrapping, etc. increase the speed of clock distribution circuitry to
acceptable levels. Most of these circuit techniques are beyond RSIM's ability to predict accurately;
incorrect predictions for critical signals can throw off the whole simulation. Using the transition time
parameters, the designer can force the rise and fall times of critical signals to their proper values,
improving  the  quality  of  the  remainder  of  the  simulation. 

It is obvious how transition time parameters affect the scheduling of events, but what about the
timing of a node connected directly to a forced node by a source/drain connection? A workable
scenario treats a node with forced timings as an input, setting its dynamic resistance
(ft, jj. Rgr,j. and ft, ) and capacitance parameters to zero. (Note that the value calculation, which uses
static resistances, is unaffected.) The transition time for a node connected to a forced node is the sum
of the given transition time for the forced node and the RC time constant of the path from the forced
node.


(a) original circuit with forced node (b) equivalent network for node B

Figure  4.20.  How  forced  timings  affect  neighboring  nodes 

If  a  node  is  connected  to  more  than  one  forced  node,  the  smallest  forced  time  constant  is  used. 
Neighbors  of  forced  nodes  always  change  value  after  the  forced  node  —  a  reasonable  prediction. 
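
The timing rule for neighbors of forced nodes can be written down as a short Python sketch. The function name and the tuple layout are hypothetical, and reading "the smallest forced time constant" as the smallest sum of forced time plus RC is an interpretation, not something the text spells out.

```python
def neighbor_transition_time(forced_paths):
    """Transition time (in time quanta) of a node adjacent to forced nodes.

    forced_paths: iterable of (forced_time, path_resistance, node_capacitance),
    one entry per forced neighbor.
    """
    paths = list(forced_paths)
    if not paths:
        raise ValueError("node has no forced neighbors")
    # Forced node's own transition time plus the RC constant of the connecting
    # path; with several forced neighbors the smallest total is used.
    return min(t + r * c for t, r, c in paths)
```

Because the RC term is never negative, the result is never earlier than the forced node's own transition, which matches the observation that neighbors change value after the forced node.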

A much more powerful mechanism for forcing the desired prediction is modification of the
circuit itself, replacing troublesome configurations with others that simulate correctly. Piecemeal
modification of a large circuit can quickly lead to a loss of confidence in the simulation results,
especially if the replacements are performed in a haphazard manner. On the other hand, the
systematic identification and replacement of specific subcircuits, drawing from a library of approved
replacements, offers the opportunity to improve simulation accuracy for common subcircuits.

The pattern matching/replacement program MATCH, written by John Her [Uer83], provides an
efficient way to systematically modify pieces of large circuits. The circuit to be modified is identified
by a pattern specifying a prototype subcircuit. Each node in the prototype is given a type which
controls what nodes it matches in the actual circuit:

(1)  matched  only  by  a  circuit  node  with  exactly  the  same  connections  specified  in  the 
pattern. 

(2)  matched  by  a  circuit  node  with  at  least  the  connections  specified  in  the  pattern, 
but  the  circuit  node  may  also  have  other  connections. 

(3) matched by a node with the same name.

The pattern indicates which prototype nodes attach to each transistor in the prototype, and can further
constrain the match by giving an explicit size or resistance for each prototype transistor. The
replacement can modify parameters of existing circuit components, and add or delete components.
For example, the following figure shows a pattern and replacement for the bootstrap circuit discussed
in section 3.2.


[Diagram labels: type (3) nodes; type (2) node]

(a) pattern (b) replacement

Figure 4.21. Pattern/replacement for bootstrap circuit
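
The three prototype node types listed earlier can be expressed as a small Python predicate. This is only a sketch under simplifying assumptions: a node's connections are summarized as a count per terminal kind, and the names `NodeType` and `node_matches` are hypothetical. A real matcher also has to check that the matched transistors are the same ones across the whole pattern, which this predicate does not attempt.

```python
from collections import Counter
from enum import Enum


class NodeType(Enum):
    EXACT = 1      # type (1): exactly the connections given in the pattern
    AT_LEAST = 2   # type (2): the pattern's connections, possibly with others
    NAMED = 3      # type (3): matched by name


def node_matches(kind, proto_conns, circuit_conns, proto_name=None, circuit_name=None):
    """Decide whether a circuit node may stand in for a prototype node.

    proto_conns / circuit_conns: Counter of terminal kinds ('gate', 'source',
    'drain') attached to the node, an illustrative summary of its connections.
    """
    if kind is NodeType.NAMED:
        return proto_name is not None and proto_name == circuit_name
    if kind is NodeType.EXACT:
        return +proto_conns == +circuit_conns      # unary + drops zero counts
    # AT_LEAST: every connection in the pattern must be present in the circuit
    return all(circuit_conns[k] >= n for k, n in proto_conns.items())
```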

MATCH is regularly used in at least one industrial environment to improve the predictions of
RSIM. Her suggests other uses for the program: gathering of circuit statistics, identifying common
circuit errors, and implementing circuit changes (ECO's) without requiring the regeneration of the
entire netlist. MATCH has proved to be a handy tool.

4.4.  An  evaluation  of  RSIM 

RSIM  has  simulated  a  large  number  of  designs,  both  in  university  and  industrial  environments. 
Industrial designers are attracted to RSIM because of its ability to correctly predict the functionality of
most MOS circuits without designer intervention, a unique capability in a logic simulator efficient
enough to accommodate large designs. RSIM's timing estimates are helpful in locating gross timing
errors in industrial designs, but the conservative nature of the estimates makes them unsatisfactory for
fine tuning critical circuitry. In short, RSIM allows the verification of large industrial designs, at a level
of  detail  not  obtainable  with  other  simulators. 

Timing estimates appear to be more important for academic users who, more often than not,
have not paid as much attention to the performance of each individual circuit component. RSIM makes
a good breadboard for locating performance bottlenecks and experimenting with potential solutions.
Since transition timings automatically reflect output loadings and device sizes, the naive user's
attention is focused on critical portions of the design. RSIM is a good companion for the novice
designer because of its ability to qualitatively model much of the behavior of MOS circuitry.


RSIM advances the state of the art of simulation in several ways. The linear model embodied by
RSIM is a systematization of a common rule-of-thumb for estimating circuit performance. The
simulator was originally developed simply to automate the calculation of RC time constants, and to
reap the benefits of applying the same timing criteria uniformly to the entire circuit. The analysis of
propagation delay in Chapter 3 justifies the use of the linear model as a simple approximation and
extends the rule-of-thumb to include the effects of the input waveform timings on gate propagation
delay.  RSIM  breaks  new  ground  by  combining  logic-level  simulation  with  the  ability  to  automatically 
estimate  transition  times  directly  from  the  electrical  properties  of  the  circuit  components.  While  the 
results  are  less  accurate  than  circuit  analysis,  the  designer  is  compensated  by  an  increase  in 
computation  speed  by  several  orders  of  magnitude.  RSIM  represents  a  first  cut  at  a  stylized  form  of 
circuit  analysis  which  attempts  to  model  the  significant  effects  at  far  less  cost  than  traditional  analysis 
techniques. The proven utility of RSIM augurs well for further developments in the area between logic
simulation  and  circuit  analysis. 

The  introduction  of  intervals  to  characterize  the  operation  of  circuit  components  controlled  by 
X-valued signals is a novel technique for merging electrical analysis with the logical concept of
unknown  signal  values.  The  use  of  intervals  allows  one  to  easily  compute  the  electrical  consequences 
of  unknown  node  values,  resulting  in  predictions  more  satisfactory  than  those  obtainable  from 
conventional  logic  simulators  or  circuit  analysis  programs. 

There is, of course, plenty of room for improvement in RSIM. For example, interconnect is not
modeled at all. As a circuit's physical size decreases, the transmission delay introduced by the
interconnect is as large as the propagation delay of the gates. Certain layout techniques, such as a
long run of polysilicon, are inherently slow and might become the fatal flaw in an otherwise carefully
tuned design. [Penfield81] offers some computationally reasonable models for predicting transmission
delays; these models are well-suited for incorporation into RSIM. His analysis, along with that of
[Horowitz83], offers some insight into the correct modeling of pass gates and distributed capacitances.
(The  lumped  approximation  used  by  RSIM  can  be  very  pessimistic.)  Along  the  same  lines,  the 
development  of  better  time  constants  for  charge-sharing  events  would  improve  the  modeling  of  circuits 
containing  both  large  and  small  capacitances. 




Another  class  of  problems  is  introduced  by  the  one-pass  nature  of  the  computations.  In  order  to 
limit  the  amount  of  computation  needed  for  each  prediction,  the  algorithms  are  constrained  to  make 
only one pass over the surrounding network. While most MOS circuits are trees, and hence amenable
to a one-pass analysis, circuits that contain cycles are not handled correctly. The proposed solution,
choosing a single path through the cycle to represent the cycle's resistance, is definitely ad hoc;
performing the correct series/parallel analysis would be preferable.

There  is  also  a  need  to  consider  the  effects  of  deviations  in  device  performance  from  that 
predicted  by  first-order  theory.  Some  effects  (channel  length  modulation,  body  effect,  short  channel 
effects)  might  best  be  handled  during  the  calibration  process.  Other  effects  (Miller  capacitance)  may 
lead  to  further  modifications  in  the  model  or  calculation  of  device  parameters  in  order  to  ensure 
conservative  predictions.  Finally,  there  is  the  possibility  that  work  on  waveform  bounding  [Wyatt83], 
which  seeks  to  obtain  closed-form  equations  for  the  waveform  of  each  node  of  a  circuit,  can  provide  a 
replacement for the linear model presented here.


CHAPTER FIVE


Simulation  Using  a  Switch  Network  Model 


If a designer is only interested in the logical properties of a circuit, i.e., those properties
independent of performance issues, it is possible to simplify the linear model of the previous chapter
even further by modeling each transistor as an on/off switch whose state is determined by the type of
transistor and the state of its gate node. This chapter discusses the switch model from two points of
view: first, as a special case of the linear model, and then as a self-contained model. But first, a small
digression  on  the  representation  of  node  values  is  in  order. 

5.1.  Representing  node  values 

The  success  or  failure  of  a  logic-level  simulator  often  hinges  on  the  choice  of  the  set  of  possible 
node  values.  If  the  set  is  too  small,  the  actual  node  value  may  not  be  precisely  described  by  any  one 
of  the  available  values  and  the  simulator  must  choose  an  approximation.  Usually  the  approximation 
involves some variant of the X (unknown) value which may carry logical implications beyond what the
network itself imposes; such a choice is termed either "conservative" or "pessimistic" depending on
one's point of view. If the set is large, it becomes difficult to establish whether the simulator's
calculations are correct in all cases. Relying on the accumulated evidence of many simulation runs
when arguing correctness lacks the rigor that leads to total confidence in the algorithm. This section
develops  criteria  for  evaluating  a  set  of  node  values. 


There are three major influences on the choice of the node-value set:

(1)  the  need  to  report  node  values  to  the  user; 

(2) the need to determine the state of each network component from the values of
its terminal nodes; and

(3) the need to represent intermediate values during an incremental simulation
calculation.

If only the first two influences are considered, a three-value set (0, 1, and X†) will suffice for
logic-level  simulation.  Users  and  component  models  cannot  reasonably  expect  more  information  than 
provided  by  this  set,  since  most  logic-level  algorithms  cannot  support  more  detailed  deductions  from 
arbitrary  MOS  networks  with  any  degree  of  accuracy.  It  is  the  third  influence  that  leads  to  all  the 
complication. 

Almost  all  logic  simulators  analyze  a  network  piece  by  piece,  modifying  their  estimates  for  node 
values  as  the  effect  of  each  piece  of  the  network  is  determined.  Until  the  new-value  computation  is 
completed,  the  intermediate  node  values  serve  as  accumulators  that  store  all  the  information  the 
simulator has about the effects of network pieces already examined. Thus, distinct values are needed
for all qualitatively different intermediate states: e.g., a node currently at logic high might have that
value because examination of the network to date revealed that it was (i) storing charge, (ii) connected
to a depletion pullup, or (iii) being precharged by an enhancement device. The simulator must
distinguish among these possibilities, since the final value of the node may be different in each case if, for
example, further network processing discovers a pulldown for the node. The exact number of values
needed depends on the details of the simulation computation; most simulators fall into one of the two
categories discussed below. As will be seen, the two categories are distinguished by their approach to
X  values. 


†It might be useful to distinguish X', an unknown but legitimate logic value (e.g., the output of a pair of
cross-coupled inverters), from other types of X values. X' values are well behaved in logic operations; for example,
B + ¬B = 1 if the value of B is X', but equals X if the value of B is X. Such distinctions might be important during
initialization. [Stevens83] describes a simulator that uses this distinction to improve its predictions for certain simple
logic circuits.


5.1.1. Cross-product value sets

One intuitively appealing approach to choosing a set of node values is to think of each value as
having several distinct attributes chosen from independent categories. Thus, for example, one might
characterize a node's logic state and the "strength" of the value separately. The logic state is usually
one of 0, 1, or X; sometimes a high-impedance state, Z, is included to represent the output of tri-state
logic gates [Flake80, Holt81]. The strength indicates what sort of network connection exists between
the source of the value and the current node:

input. Node is a designated input (e.g., VDD or GND). The value of an input node can
only be changed by explicit simulator commands; the assumption is that inputs
supply enough current to be unaffected by connections (possibly shorts to other
inputs) made by transistor switches.

driven. Node is connected by closed switches to inputs or other driven nodes. Driven
nodes can affect the value of weak or charged nodes without being affected
themselves, but may be forced to an X state if shorted to an input or driven node that
has a different logic level.

weak. Node is connected to an input node by a depletion-mode transistor. Weak
nodes can affect charged nodes without being affected themselves, but are forced to a
driven state when connected to another driven or input node. A weak node returns
to the appropriate weak state when completely disconnected from driven or input
nodes (i.e., a weak node can never enter the charged state).

charged. Node is connected, if at all, only to other charged nodes. Until reconnected
to some other part of the network, charged nodes maintain their current logic state
indefinitely (charge storage with no decay). This is the default state of all non-weak
nodes.

Other strengths can be included to model the effects of differently sized transistors, node capacitors,
etc.


The plethora of 9-, 12-, and 16-state logic simulators (see [Newton80]) use values chosen from
the set formed by the cross product of the various value attributes. For example, a 9-state simulator
might use


                       logic state
                     0      1      X

          driven     DL     DH     DX

strength  weak       WL     WH     WX

          charged    CL     CH     CX
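
The nine values can be generated mechanically as a cross product, which the following Python sketch shows. The two-letter encoding mirrors the table (strength letter followed by L, H, or X); the names `STRENGTHS`, `STATES`, `VALUES`, and `describe` are hypothetical.

```python
from itertools import product

STRENGTHS = ["D", "W", "C"]    # driven, weak, charged
STATES = ["L", "H", "X"]       # logic 0, logic 1, unknown

# Every combination of one strength with one logic state.
VALUES = [s + v for s, v in product(STRENGTHS, STATES)]


def describe(value):
    """Split a two-letter value such as 'WH' into (strength, logic state)."""
    strength = {"D": "driven", "W": "weak", "C": "charged"}[value[0]]
    state = {"L": "0", "H": "1", "X": "X"}[value[1]]
    return strength, state
```

Because the two attributes are chosen independently, the size of the value set is simply the product of the sizes of the attribute sets, here 3 x 3 = 9.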


Note  that  in  this  formulation,  X  is  treated  as  sort  of  a  third  logic  value  on  a  par  with  0  and  1; 
presumably X's are generated by the simulator to model invalid combinations of 0's and 1's. The
implication is that one can determine if a value should be X without any consideration of strengths.
(Remember that the main motivation of forming the cross product is that the various attributes are
independent.) This can lead to pessimistic predictions, as is shown in an example below.

It  is  useful  to  order  the  possible  signal  values  according  to  their  relative  strengths.  Intuitively, 
value A is stronger than value B, written A > B, if value A predominates when both signals are shorted
together. Of course there are situations where neither value emerges unscathed (for example, when
two signals of the same strength but opposite logic states are shorted), in which case neither signal is
said to be stronger than the other. The notion of strength can be formalized using a lattice of node
values,  for  example: 


      DX
     /  \
   DH    DL
     \  /
      WX
     /  \
   WH    WL
     \  /
      CX
     /  \
   CH    CL
     \  /
      λ


Figure 5.1. Lattice of node values for a 9-state simulator


The node value λ is used to represent the null signal, no signal at all.

Referring  to  the  lattice,  given  two  values  A  and  B,  A  >  B  if  A  is  not  equal  to  B  and  there  is  an 
upward  path  through  the  lattice  that  starts  at  B  and  reaches  A.  For  example 
DX is greater than all other signals,

DH is greater than WL, but
WL is not greater than WH.

The least upper bound (l.u.b.) of two values A and B, written A U B, is defined to be the value C
such  that 


(i) C ≥ A

(ii) C ≥ B

(iii) for every value D, if D ≥ A and D ≥ B, then D ≥ C

where P ≥ Q means that either P > Q or P = Q.

Examining the lattice above, it is easy to see that the l.u.b. always exists for any two node values.
Note that if A > B, A U B = A; the l.u.b. captures our intuition about what should happen when two
signals of different strengths are shorted together. With the appropriate placement of X values in the
lattice, the l.u.b. can be used to predict the outcome when any two signals are shorted.
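
The least upper bound can be computed directly from the lattice, as in the Python sketch below. The dictionary `ABOVE` transcribes the upward edges of the lattice in figure 5.1, with the null signal written as `NULL`; the names `ABOVE`, `up_set`, and `lub` are hypothetical, and this is an illustration of the definition rather than the way any particular simulator implements it.

```python
# Each value maps to the values immediately above it in the lattice.
ABOVE = {
    "NULL": ["CH", "CL"],
    "CH": ["CX"], "CL": ["CX"],
    "CX": ["WH", "WL"],
    "WH": ["WX"], "WL": ["WX"],
    "WX": ["DH", "DL"],
    "DH": ["DX"], "DL": ["DX"],
    "DX": [],
}


def up_set(v):
    """All values greater than or equal to v (v plus everything reachable upward)."""
    seen, stack = {v}, [v]
    while stack:
        for w in ABOVE[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def lub(a, b):
    """Least upper bound: the common upper bound that lies below all the others."""
    common = up_set(a) & up_set(b)
    for c in common:
        if common <= up_set(c):   # every common upper bound is >= c
            return c
    raise ValueError("no least upper bound")
```

For instance, `lub("DH", "WL")` is `DH` (the stronger signal wins), while `lub("DH", "DL")` is `DX` (equal strengths with opposite logic states).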

The  interpretation  of  X  values  captured  by  the  lattice  above  is  quite  appropriate  for  describing 
the  logic  state  of  nodes  involved  in  a  short  circuit: 


[Circuit diagram: DX = DH U DL]


Figure  5.2.  A  short  circuit  leading  to  an  X  value 


Assuming the two transistors are the same size, the middle node's value is the result of merging two
equal strength signal values. According to our lattice, this merger yields an X value. Short circuits are
the mechanism by which X's are introduced into a network previously containing only 0's and 1's.

However,  the  situation  is  not  as  straightforward  when  one  considers  connections  formed  by 
transistors  with  a  gate  signal  of  X.  The  resulting  values  cannot  be  computed  directly  using  the  U 
operation  on  the  source  and  drain  signals,  and  once  that  hurdle  has  been  surmounted,  there  is  some 
difficulty  in  choosing  which  value  to  use  from  the  cross-product  value  set.  Consider  the  following 
analysis  of  a  node  with  stored  charge  and  connection  to  two  transistors. 


Figure  5.3.  Incremental  analysis  of  a  simple  network 


Before any connections to the node have been discovered (figure 5.3(a)), the node maintains the
charge of its last driven value, say, logic low; the simulator would assign the node a value of CL.
After the first transistor is discovered (figure 5.3(b)), the facts change:

(i) Because of the X on the gate of the transistor, one cannot be certain what type
of connection exists between the node in question and the DH on the other side
of the transistor. Thus, the new logic state of the node should be X.

(ii) the strength of the new value is uncertain, but clearly "weak" or "charged"
would be inappropriate since they understate the strength in the case where the
unknown gate value was actually a 1.

Since a weak or charged value could be overridden by an enhancement pulldown discovered later on,
mistakenly leading to a DL value, the simulator has no choice but to select a driven value. The
conclusion: DX is the only state available that handles all eventualities in a conservative fashion. Of
course,  with  knowledge  of  what  the  rest  of  the  network  contains,  the  simulator  could  make  a  more 
intelligent  choice,  but  this  is  beyond  the  ken  of  an  incremental  algorithm. 

By the time a connection to a depletion pullup is discovered (figure 5.3(c)), the die has been cast:
the previously chosen DX value overrides any contribution by the pullup (DX U anything = DX).
While this answer is not wrong, it is more conservative than required; at this point the logic state of
the node should be 1. The pullup guarantees a logic 1 with the unknown connection to DH, only
leaving doubts about the strength of the value (somewhere between weak and driven).
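
The incremental analysis of figure 5.3 can be replayed in a few lines of Python. The `merge` function is a compact form of the l.u.b. for the nine cross-product values; `through_x_gate` encodes the conservative choice described above. Applying that choice to every connection through an X-gated transistor, not just this example, is an assumption of the sketch, and both function names are hypothetical.

```python
STRENGTH = {"C": 1, "W": 2, "D": 3}   # charged < weak < driven


def merge(a, b):
    """l.u.b. of two cross-product values such as 'DH', 'WL', or 'CX'."""
    if a == b:
        return a
    sa, sb = STRENGTH[a[0]], STRENGTH[b[0]]
    if sa != sb:
        return a if sa > sb else b    # the stronger value predominates
    return a[0] + "X"                 # same strength, conflicting states


def through_x_gate(node, other):
    """Value of node after discovering a transistor with gate X leading to other."""
    if merge(node, other) == node:
        return node                   # open or closed, the node is unchanged
    return "DX"                       # strongest X covers every eventuality


value = "CL"                          # figure 5.3(a): stored charge, logic low
value = through_x_gate(value, "DH")   # figure 5.3(b): X-gated path to DH
value = merge(value, "WH")            # figure 5.3(c): depletion pullup
```

The final `value` is `DX`, even though a logic 1 would be justified: once the strongest X has been chosen, no later contribution can change it.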

Proponents  of  cross-product  value  sets  might  point  out  that  the  analysis  would  have  generated  a 
different  answer  if  the  transistors  had  been  discovered  in  a  different  order.  The  somewhat 
embarrassing  ability  to  produce  two  different  answers  for  the  same  network,  both  correct,  is  caused  by 
the fact that the merge operation is not associative when connections are made through transistors
with X gates. In fact, most incremental simulators that use cross-product value sets perform the
incremental analysis in an order that yields a reasonable answer on the example above. Unfortunately,
it is usually possible to confound them with more complex circuits containing X's; while such circuits
are not commonplace, they often crop up during network initialization when all nodes start off at X.†

In  conclusion,  it  is  possible  to  build  effective  simulators  using  cross-product  value  sets;  however, 
they  can  make  conservative  predictions  on  circuits  that  contain  X's.  In  practice,  this  leads  to  difficulty 
in  initializing  some  circuits  and  to  occasional  over-propagation  of  X  values. 


†[Bryant81] suggests using an incremental calculation only for subnetworks of nodes connected by non-X transistors.
Once these values have been computed, a separate computation merges subnets connected by X transistors. Since
this computation has global knowledge of the network, it can avoid the problems mentioned here.


5.1.2. Interval value sets

The difficulties with the cross-product value set arise because of its separation of the notion of
strength and logic state. Once a node value is set to an X value at some strength, it cannot return to a
normal logic state unless overpowered by a stronger signal: if a node is set to the strongest X value, it
stays at that value for the rest of the computation. As in the example above, this leads to conservative
predictions when the strongest X value is chosen because of the lack of suitable alternatives.
Specifically, the difficulty came about because the simulator had to pick the highest strength to be on
the  safe  side;  there  was  no  value  available  that  would  indicate  that  the  logic  low  signal  which 
contributed  to  the  intermediate  X  value  was  of  very  low  strength  and  hence  might  be  overridden  by 
later  network  components. 

This suggests a different approach to constructing the set of possible node values, one based on
intervals. First one starts with a set of node values with a range of strengths and 0/1 logic states, for
example, the six non-X states used above: {DH, DL, WH, WL, CH, CL}. Then additional values are
introduced by forming intervals from two of the basic values; if there are six basic values, then there
are $\binom{6}{2} = 15$ such intervals, leading to a total of 21 node values altogether.

Intervals  represent  a  range  of  possible  values  for  a  node.  The  size  of  the  range  is  related  to  the 
strength  of  its  end  points.  If  we  arrange  the  six  basic  values  in  a  spectrum  ranging  from  the  strongest 
1 (DH) to the strongest 0 (DL), the possible node values can be shown graphically:

Figure 5.4. The 21 node values of the interval value set


Intervals that do not cross the center line correspond to a valid logic state: intervals above the line
represent logic high values, and those below the line, logic low. Intervals that cross the center line
represent X values. (The X values of the previous section correspond to intervals with equal strength
end points: DX = [DL,DH], WX = [WL,WH], and CX = [CL,CH].) Thus, X values result from
ambiguity about which of the base values best represents the true node value. As will be seen below,
this is more satisfactory than thinking of X as a third, independent logic state.

When the simulator merges two node values, it chooses the smallest interval that covers all the
possible node states. However, unlike the cross-product value set, the interval set can represent X
values without losing track of the strengths of the signals that lead to the X values. Consider the
problems raised by figure 5.3(b). Using an interval value set, the resulting node value is naturally
represented by [CL,DH], an interval that corresponds to an X logic state. When the pullup is
discovered (figure 5.3(c)), the simulator can narrow this interval to [WH,DH] since the pullup
overpowers the weaker CL value. This corresponds to a logic high signal, a sensible answer.

An algebra for calculating the result of merging two interval node values is developed in
[Flake83]; a different approach is adopted in section 5.4.1 where a detailed description of the merge
operation can be found. With an interval value set, the merge operation is commutative and
associative, and the network can be processed in any order without affecting the final node values.
The extra 12 values introduced by the interval value set are needed to carry sufficient information
about how the current value was determined, to ensure that the final answer is independent of the
processing order.

The examples above suggest the following conjecture about the correct size of a node value set.
Assuming that one has s different signal strengths and two logic levels (0 and 1), then
$2s + \binom{2s}{2}$ values are needed to ensure that the signal algebra is well-formed. In simulators
with too few states, some states take on multiple meanings; for example, the DX value in the
cross-product value set is used to describe nodes that fall into 5 separate values in the interval value set:

[DL,DH]  [WL,DH]  [CL,DH]  [WH,DL]  [CH,DL]

This  lack  of  expressive  power  on  the  part  of  cross-product  value  sets  is  what  leads  to  pessimistic 
predictions  for  node  values  in  certain  networks. 
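
The counting argument can be checked with a short sketch. The Python below is only an illustration, not part of any simulator described here: the tuple encoding of intervals and the helper names (`interval_value_set`, `logic_state`, `name`) are assumptions made for the example. It enumerates the value set for s = 3 strengths, classifies each value by whether it crosses the center line of the spectrum, and lists the X intervals that have a driven (D) end point, which are the five values the text says DX has to stand for.

```python
from itertools import combinations

# Spectrum from the strongest 1 to the strongest 0, as described for figure 5.4.
SPECTRUM = ["DH", "WH", "CH", "CL", "WL", "DL"]


def interval_value_set(spectrum):
    """Return every node value as a (top, bottom) pair of spectrum indices."""
    basic = [(i, i) for i in range(len(spectrum))]            # the 2s basic values
    intervals = list(combinations(range(len(spectrum)), 2))   # the C(2s, 2) intervals
    return basic + intervals


def logic_state(value, spectrum):
    """Classify a value: '1' above the center line, '0' below it, 'X' if it crosses."""
    center = len(spectrum) // 2        # index of the first logic-low basic value
    top, bottom = value
    if bottom < center:
        return "1"
    if top >= center:
        return "0"
    return "X"


def name(value, spectrum):
    """Render a value in the bracket notation used in the text."""
    top, bottom = value
    if top == bottom:
        return spectrum[top]
    return f"[{spectrum[top]},{spectrum[bottom]}]"


values = interval_value_set(SPECTRUM)
by_state = {"0": [], "1": [], "X": []}
for v in values:
    by_state[logic_state(v, SPECTRUM)].append(name(v, SPECTRUM))

# X intervals with at least one end point at the strongest (D) level.
last = len(SPECTRUM) - 1
dx_like = [name(v, SPECTRUM) for v in values
           if logic_state(v, SPECTRUM) == "X" and (v[0] == 0 or v[1] == last)]
```

The sketch writes the logic-high end point first, so `[DH,DL]` here is the same interval the text writes as [DL,DH].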

5.2.  Developing  the  switch  model 

Switch models of MOS circuits are of interest since a switch is the simplest component that meets
the criteria outlined in Chapter 1: switches are inherently bidirectional and the logic operations they
implement can be computed with acceptable efficiency in large networks.

Randy Bryant [Bryant79], one of the first to apply switch-level simulation to MOS transistor
networks, viewed the network as divided into equivalence classes. Two nodes are equivalent if they
are connected by a path of closed switches. Nodes in the same equivalence class as VDD are assigned a
logic high state; those equivalent to GND, a logic low state. A pullup (a depletion-mode transistor
which is always on in the switch model) gives the node to which it is attached a special property: if an
equivalence class of nodes does not contain either VDD or GND, but does contain a pulled-up node, all
the nodes in the class are assigned a logic high state. Finally, if an equivalence class contains neither
an input nor a pulled-up node, it is "storing charge" and maintains whatever logic state it had last.
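
One pass of this equivalence-class rule can be sketched as follows. This is an illustrative reconstruction from the paragraph above, not Bryant's code: the function name `settle_once`, its arguments, and the use of a union-find structure to form the classes are all assumptions. X-gated switches are deliberately left out, since the pure model handles them poorly, and a class containing both VDD and GND (a short, which the text does not discuss) arbitrarily takes VDD's value.

```python
def settle_once(nodes, closed_switches, pulled_up, old_state):
    """Assign logic states from one equivalence-class pass of the pure switch model.

    nodes           -- iterable of node names, including "VDD" and "GND"
    closed_switches -- iterable of (a, b) node pairs joined by a closed switch
    pulled_up       -- set of nodes that have a pullup attached
    old_state       -- dict: node -> previous logic state (0 or 1)
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the class representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in closed_switches:
        parent[find(a)] = find(b)          # merge the two classes

    classes = {}
    for n in list(parent):
        classes.setdefault(find(n), set()).add(n)

    new_state = {}
    for members in classes.values():
        if "VDD" in members:               # assumption: VDD wins if GND is also present
            value = 1
        elif "GND" in members:
            value = 0
        elif members & pulled_up:
            value = 1                      # pulled-up class with no input
        else:
            value = None                   # storing charge: keep the old states
        for n in members:
            new_state[n] = old_state[n] if value is None else value
    return new_state


# Hypothetical example: a pulled-up node "out" with a pulldown to GND, plus an
# isolated node "store". First the pulldown is closed, then it is open.
NODES = ["VDD", "GND", "out", "store"]
pulled_low = settle_once(NODES, [("out", "GND")], {"out"}, {"out": 1, "store": 1})
pulled_high = settle_once(NODES, [], {"out"}, {"out": 0, "store": 0})
```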

The simulator based on this switch model iteratively calculates the equivalence classes for all the
nodes in the network until two successive calculations return the same result (i.e., no nodes change
state). Unfortunately this pure switch model has some deficiencies:

(i)  Switches in indeterminate states (those with "gate" nodes of X) make the
equivalence calculation somewhat more difficult. The desired computation is
inefficient since it involves a combinatorial search: all combinations of on/off
assignments to switches in the X state need to be investigated to determine
whether a switch's state makes a difference. If the network is unaffected by a
switch's state, the switch can be ignored; otherwise all affected nodes are
assigned the X state.

(ii)  The equivalence calculation is much more time consuming than necessary since it
deals with the whole circuit rather than focusing only on the parts which change.

(iii)  In certain circuits transistor "size" is important, and the notion of size cannot be
expressed in the pure switch model. A pullup is a trivial example: viewed as a
switch it was always on, but more "weakly" than the "strong" switches in the
pulldown. The size of transistors also determines the "strength" of various driver
circuits: for example, it is common for the write amplifier of a static memory to
force a value into a memory cell by simply overpowering the weaker gate in the
cell itself.

The remainder of this chapter investigates different approaches to solving the first two problems
outlined above. The third problem is addressed with some success by RSIM which uses size
information not only to calculate node values but to provide timing information as well.†

† Bryant [Bryant81] proposes extending the switch model to include a hierarchy of switch sizes, a generalization of the
ad hoc solution for pullups. His thesis develops an algebra, in the spirit of Boolean algebra, for dealing formally with
such networks.

The following sections present two different formulations of the switch model:

•  a model where each node value is computed via a "global" examination of the
network. If the network has no explicit feedback, each node value is computed
exactly once, but this calculation is more expensive than the one below.

•  a model based on "local" interactions where the simulator examines the source and
drain nodes of each transistor and updates the state of one or both nodes. The
examination/update process continues until there are no further updates to be
made, i.e., the network has "relaxed" into its final state. Under this scheme each
calculation is trivial but a node value might be computed more than once even
when there is no explicit feedback in the circuit.

ESIM (the author's switch-level simulator) is a hybrid of these two formulations. ESIM implements a
global node-value calculation using a node-value representation close to the one used by the local
simulator. This results in a calculation very similar to that implemented by RSIM, except that abstract
"logical" resistances ($R_{eff}$ = 0, 1, and ∞) are substituted for the "real" resistances used in RSIM.
Since this type of simulation algorithm is discussed at length in Chapter 4, it will not be pursued here.
Instead, the remainder of this chapter focuses on the new formulations introduced above.

The  local  formulation  is  attractive  because  it  appeals  to  our  intuition  about  how  transistors  really 
work.  The  high  degree  of  potential  parallelism  in  the  update  calculation  makes  it  a  very  attractive 
algorithm  for  many  of  the  new  parallel  architectures  now  under  development;  the  combination  of 
parallel  hardware  and  intrinsically  parallel  algorithms  may  be  the  key  to  overcoming  the  capacity 
limitations  of  current  simulation  techniques. 

5.3.  The  global  switch  model 

The global simulator calculates a node's value by computing the effect of each input on the node
of interest. The simulation is global in that each node value is based directly on the values of the
inputs to which it is connected. Thus, the values of non-input nodes do not enter into the
computation. This means that 0, 1, and X will suffice as final node values; a node state need only
capture the logic state of the node and no strength information is necessary.

5.3.1.  Node  values  in  the  global  switch  model 

Each transistor switch in the network is assigned a state determined from the transistor's type
and the current value of its gate node. This state models the switch-like qualities of the source-drain
connection without trying to capture any more detailed information about the connection; it is a
simplification of the linear model presented in earlier chapters.

The state of a transistor switch summarizes the type of connection that exists between its source
and drain nodes. For MOS circuits, the possible switch states are:

open      no connection, the state of a non-conducting n-channel (gate = 0) or
          p-channel (gate = 1) transistor.

closed    source and drain shorted, the state of a conducting n-channel (gate = 1)
          or p-channel (gate = 0) transistor.

unknown   uncertain connection between source and drain, the state of an n- or
          p-channel transistor whose gate is X.

weak      the state of a depletion transistor. Depletion devices are always assigned
          this state, regardless of the state of their gate nodes.
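
The four cases translate directly into a small lookup. The sketch below is only an illustration of the rules just listed; the function name `switch_state` and the string encodings for transistor types and states are assumptions chosen for the example.

```python
def switch_state(transistor_type, gate):
    """Return the switch state for a transistor type and the value of its gate node.

    transistor_type -- "n", "p", or "depletion" (hypothetical encoding)
    gate            -- logic state of the gate node: 0, 1, or "X"
    """
    if transistor_type == "depletion":
        return "weak"                      # always weak, whatever the gate value
    if transistor_type not in ("n", "p"):
        raise ValueError(f"unknown transistor type: {transistor_type!r}")
    if gate == "X":
        return "unknown"                   # uncertain source-drain connection
    if transistor_type == "n":
        return "closed" if gate == 1 else "open"
    return "closed" if gate == 0 else "open"   # p-channel conducts when gate = 0
```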

The relationship between a switch's state, its type, and its gate state is summarized in the following
figure.

Figure 5.5. Switch state as a function of transistor type and gate voltage


In the global simulator, the value of a node is determined by the inputs to which it is connected
and the states of the intervening switches. During the calculation of a node's value, the simulator uses
the interval node-value set presented in figure 5.4. When the calculation is complete, the resulting
interval is used to determine the final logic state of the node, using the following table.


final logic state = 0    final logic state = 1    final logic state = X

CL                       DH                       [DH,CL]
[CL,WL]                  [DH,WH]                  [DH,WL]
[CL,DL]                  [DH,CH]                  [DH,DL]
WL                       WH                       [WH,CL]
[WL,DL]                  [WH,CH]                  [WH,WL]
DL                       CH                       [WH,DL]
                                                  [CH,CL]
                                                  [CH,WL]
                                                  [CH,DL]

Table  5.1.  Relationship  between  final  logic  state  and  computed  interval  value 

The calculation of a node's value begins by discovering all the inputs which can be reached from the
node by paths of closed, weak, and unknown switches. If no inputs can be reached, the final logic
state of the node is determined by a charge sharing calculation described in the next section. If one or
more inputs can be reached, their contribution to the node's value is determined by an incremental
calculation which starts at the inputs and works its way back toward the node.


The value of a logic low input is DL; the value of a logic high input is DH. As the calculation
works back toward the node of interest, it computes an effective value that indicates the effects of
intervening switches on the original input value. The effect of a switch on a value it transmits is
specified by the switch function:

Figure 5.6. Effective value of an input after passing through a switch


The effect of a switch on a value is a function of the value and the switch's state:

Table 5.2. switch(σ, value) as a function of σ and value


A new value, λ, is introduced to describe the value transmitted by an open (non-conducting) switch,
i.e., no value at all. The value λ is weaker than CH or CL, and corresponds to a logic state of X.

When two paths merge, their effective value is determined using the U operation introduced in
section 5.1.1.

(a) two values to merge    (b) values including effect of switches    (c) merged value

Figure 5.7. Merging the values for two paths which join

The U operation is defined using the lattice shown in the following figure.

Figure 5.8. Lattice for interval-node value set

Following the procedure outlined in figure 5.7, the contributions of all inputs connected to the node of
interest can be reduced to a single interval. This interval is merged (using U) with the contribution
from the node's current logic state

$$
\text{contribution of current logic state} =
\begin{cases}
\mathrm{CL} & \text{if current logic state} = 0 \\
\mathrm{CH} & \text{if current logic state} = 1 \\
[\mathrm{CH},\mathrm{CL}] & \text{if current logic state} = X
\end{cases}
\tag{5.1}
$$

to give the final interval characterizing the node's new logic state.

As an example of how the new-value calculation works, consider the following circuit:



Figure  5.9.  Example  circuit 


Assume that the current logic state of the output is 0. The new-value calculation for this circuit is
shown in the following figure.


Figure 5.10. New-value calculation for circuit in figure 5.9

The final interval for the output node is CL U [λ,DL] = [CL,DL], which corresponds to a logic low
state. This makes sense; the previous state of the output node was logic low, so the uncertain
connection to the inverter does not affect its logic state, just the strength with which it is driven. Note
that it is important to merge the values of paths that join before continuing with the calculation since

$$
\mathrm{switch}(\sigma,\ \alpha \,\mathrm{U}\, \beta) \;\neq\; \mathrm{switch}(\sigma, \alpha) \,\mathrm{U}\, \mathrm{switch}(\sigma, \beta)
\tag{5.2}
$$

when using this particular value set and switch function. For example, if the WH and DL values had
been merged after transmission by the switch in the unknown state, the final interval for the output
node would have been [DH,WL], which corresponds to an X logic state. The calculation described
here performs all possible merges before transmitting the result through the appropriate switch.


5.3.2.  The  global  simulation  algorithm 


This section outlines the basic steps for propagating new information about the inputs to the rest
of the network, recalculating node values (where necessary) using the global value calculation in the
previous section.

When  a  node  changes  value,  it  can  affect  the  network  in  one  of  two  ways: 

(i)  directly, through source/drain connections of conducting transistors.

(ii)  indirectly, by affecting the state of transistor switches controlled by the changing
node. This in turn can cause the source and drain nodes of those switches to
change value.

The global simulator accounts for these two effects using two different mechanisms. Directly affected
nodes are handled implicitly by the new-value computation which recomputes new values for all
directly affected nodes whenever a node changes value. This is a reasonable organization: if A directly
affects B, then B directly affects A; it makes sense to compute both values at the same time since they
are closely related. Direct effects are not handled implicitly, however, when the user changes the
value of an input node. In this case, the simulator invokes the new-value computation on the input,
not to recompute the input's value (which is set by the user), but to recompute the values of all
directly affected nodes.

The indirect effects of a value change are managed by an event list that identifies all transistor
switches that have changed state. Actually, the event list keeps track of the nodes that have changed,
but this is equivalent since the network data base maintains a list of transistors controlled by each
node. The simulator operates by removing the first node from the event list, and then performing a
new-value computation for the sources and drains of all transistors controlled by that node. The
new-value computation accounts for all the direct effects of the new transistor state and adds events to the
event list if indirect effects are present. This process continues until the event list is empty, at which
point the network has "settled" and the simulator waits for further input.

while event list not empty {
    n := node associated with first event on event list
    remove first event from event list
    for each transistor with n as gate node {
        set COMPUTE flag for source and drain
    }
    for each transistor with n as gate node {
        if COMPUTE still set for source, compute new value for source [fig. 5.14]
        if COMPUTE still set for drain, compute new value for drain
    }
}


Figure 5.11. Main loop of global simulation algorithm


Finding nodes affected by an event is straightforward; recomputation of values is needed for the
sources and drains of all transistors with the changing node as gate. For example, if the node marked
(*) in the following figure changes, nodes B and C need recomputation.


Figure  5.12.  Event  for  node  (*)  involves  nodes  B  and  C 


Of course, node D also needs to be recomputed, as will be discovered during the processing of B and
C (see below).

To recompute the value of a given node, the simulator first makes a connection list containing all
nodes connected to the first node by a path of conducting transistors. The idea is to start with a node
known to be affected by an event, and then find that node's electrical neighbors, and so on, halting
whenever an input is reached. In the example above, if the (*) node's value is 1, the connection list
for node B contains nodes B, C, and D. If the (*) node's value is 0, the connection list for node B
contains only node B. Node A is not included in the list in either case because it is not connected to
node B by a path of conducting transistors. In the code below, which computes the connection list for
a given node, the terms "source" and "drain" are used to distinguish one terminal node of a transistor
from the other, and do not imply anything about the terminals' relative potential.

initialize list to have starting node as only element
set pointer to beginning of list
INPUT_FOUND := false
reset capacitance accumulators
while pointer not at end of list {
    n := node currently pointed at
    add capacitance of n to appropriate accumulator
    for each "on" transistor with source connected to n {
        if drain is an input, INPUT_FOUND := true
        else if drain not on list, add drain to end of list
    }
    advance pointer to next list element
}


Figure 5.13. Non-recursive routine to build connection list
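
For readers who prefer executable code, the routine of figure 5.13 can be sketched in Python as below. This is an illustrative transcription under stated assumptions, not the simulator's source: the network is given as a plain adjacency dictionary that already contains only conducting ("on") transistors, and the argument names are invented for the example. The small network used afterwards is hypothetical as well, since figure 5.12 is not reproduced here; it is merely wired so that B reaches C and D when the (*) transistor conducts.

```python
def build_connection_list(start, neighbors, inputs, state, capacitance):
    """Collect the nodes connected to `start` by paths of conducting transistors.

    neighbors   -- dict: node -> nodes on the far side of an "on" transistor
    inputs      -- set of input nodes (the walk halts when it reaches one)
    state       -- dict: node -> current logic state (0, 1, or "X")
    capacitance -- dict: node -> node capacitance
    Returns (connection_list, input_found, accumulators).
    """
    connection_list = [start]
    on_list = {start}                          # fast membership test for the list
    input_found = False
    accumulators = {0: 0.0, 1: 0.0, "X": 0.0}  # one accumulator per logic state

    pointer = 0
    while pointer < len(connection_list):      # the list grows while it is scanned
        n = connection_list[pointer]
        accumulators[state[n]] += capacitance[n]
        for other in neighbors.get(n, ()):
            if other in inputs:
                input_found = True             # inputs are noted but not expanded
            elif other not in on_list:
                on_list.add(other)
                connection_list.append(other)
        pointer += 1
    return connection_list, input_found, accumulators


# Hypothetical network: B-C through the (*) transistor, C-D, and D tied to VDD and GND.
conducting = {"B": ["C"], "C": ["B", "D"], "D": ["C", "VDD", "GND"]}
states = {"B": 1, "C": 0, "D": 0}
caps = {"B": 1.0, "C": 2.0, "D": 3.0}

on_case = build_connection_list("B", conducting, {"VDD", "GND"}, states, caps)
off_case = build_connection_list("B", {}, {"VDD", "GND"}, states, caps)
```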


In addition to the connection list, the routine sets INPUT_FOUND to true if the tree walk discovered at
least one input, and maintains three capacitance accumulators, one for each logic state. The
connection list drives the new-value computation:

make connection list starting with given node [fig. 5.13]
if no inputs found, do charge sharing
else for each node on connection list {
    compute interval value for node [fig. 5.15]
    determine new logic state using Table 5.1
    if different from old logic state {
        update logic state to new value
        enqueue new event
    }
}
reset COMPUTE flag for each node on connection list

Figure 5.14. Subroutine to compute new value for node


If no inputs are found while building the connection list (INPUT_FOUND is false), the group of nodes is
completely isolated from any inputs and a charge sharing computation determines the nodes' new
values. Assuming that all the node capacitors are shorted together, the resulting voltage is

$$
\text{voltage of shorted capacitors} = \frac{\sum \text{capacitors at logic high}}{\sum \text{all capacitors}}
\tag{5.3}
$$

Capacitors with a logic state of X are assumed to be charged high when computing the maximum
possible voltage, and charged low when computing the minimum voltage:

$$
\text{charge sharing value} =
\begin{cases}
0 & \text{if } \dfrac{C_{high} + C_X}{C_{total}} < 0.2 \\[2ex]
1 & \text{if } \dfrac{C_{high}}{C_{total}} > 0.8 \\[2ex]
X & \text{otherwise}
\end{cases}
$$

where $C_{total}$ is the sum of the capacitance accumulators, $C_{high}$ is the accumulator corresponding to
logic high, and $C_X$ is the accumulator corresponding to logic X.
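
The charge sharing rule can be sketched in a few lines of Python. This is a hedged illustration: the function name and parameters are invented for the example, the 0.2 and 0.8 thresholds are the ones read from the text, and returning X whenever neither test succeeds is an assumption based on the description of the maximum and minimum voltages.

```python
def charge_sharing_value(c_high, c_low, c_x, low_threshold=0.2, high_threshold=0.8):
    """Return 0, 1, or "X" for an isolated group of nodes.

    c_high, c_low, c_x -- capacitance accumulators for logic high, low, and X
    The voltages are fractions of the full logic swing, as in equation 5.3.
    """
    c_total = c_high + c_low + c_x
    if c_total <= 0:
        raise ValueError("total capacitance must be positive")
    v_max = (c_high + c_x) / c_total   # X capacitors assumed charged high
    v_min = c_high / c_total           # X capacitors assumed charged low
    if v_max < low_threshold:
        return 0                       # even the worst case stays low
    if v_min > high_threshold:
        return 1                       # even the worst case stays high
    return "X"                         # assumption: anything in between is X
```

For instance, a small node at logic high sharing charge with a much larger node at logic low settles low, which is the usual behavior of a lightly loaded node connected to a bus.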

If one or more inputs are found (INPUT_FOUND is true), the value of each node is determined in
accordance with the procedure described in the previous section. The interval value is calculated for
each node in turn and the node's new logic state is computed using Table 5.1. New events are added
to the end of the event list whenever a node changes value. If a changing node is already on the
event list, nothing happens (the node is not moved to the end of the list).
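
The event-list discipline just described (append at the end, ignore a node that is already queued) can be sketched as a small class. The class and method names are assumptions made for the illustration; the text does not describe the actual data structure.

```python
from collections import deque


class EventList:
    """FIFO list of changed nodes; a node already on the list is not re-queued."""

    def __init__(self):
        self._queue = deque()
        self._queued = set()               # nodes currently waiting on the list

    def enqueue(self, node):
        """Add node at the end; return False if it was already pending."""
        if node in self._queued:
            return False                   # keep its existing position
        self._queue.append(node)
        self._queued.add(node)
        return True

    def pop_first(self):
        """Remove and return the oldest pending node."""
        node = self._queue.popleft()
        self._queued.discard(node)
        return node

    def __bool__(self):
        return bool(self._queue)


events = EventList()
events.enqueue("a")
events.enqueue("b")
duplicate_accepted = events.enqueue("a")   # "a" is still pending
order = []
while events:
    order.append(events.pop_first())
```

Keeping a set alongside the queue makes the "already on the list" test constant time, which matters when many nodes change during initialization.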

For  efficiency,  each  affected  node’s  value  is  only  computed  once  while  processing  a  given  event. 
The connection list ensures that all affected nodes are recomputed; the COMPUTE flag ensures that
once  a  node  has  appeared  on  some  connection  list,  it  will  not  be  resubmitted  for  processing  during  the 
current  event. 

The  computation  of  a  node's  value  is  easily  described  by  a  recursive  procedure  which  analyzes 
the  surrounding  network: 

if node is logic low input {
    return DL
} else if node is logic high input {
    return DH
} else {
    LOCAL_IV := value specified by equation 5.1
    set VISITED flag for current node
    for each "on" transistor, t, with source connected to current node {
        if drain does not have VISITED flag set {
            recursively determine interval value for drain node
            LOCAL_IV := LOCAL_IV U switch(σ_t, drain's interval value)
        }
    }
    reset VISITED flag for current node
    return LOCAL_IV
}

Figure 5.15. Subroutine to compute interval value for node

The variable LOCAL_IV is a stack-allocated local variable of the subroutine. Returning to the example
in figure 5.12, assuming that the (*) node's value is 1, and that the old values for B, C, and D are
B = 1, C = 0, and D = 0, the following calls are made when computing the new value for node C:

compute_params(C)
    LOCAL_IV = CL
    compute_params(D)
        LOCAL_IV = CL
        compute_params(VDD)
            return DH
        LOCAL_IV = CL U WH = WH
        compute_params(GND)
            return DL
        LOCAL_IV = WH U DL = DL
        return DL
    LOCAL_IV = CL U [λ,DL] = [CL,DL]
    compute_params(B)
        LOCAL_IV = CH
        return CH
    LOCAL_IV = [CL,DL] U CH = [CH,DL]
    return [CH,DL]

Figure 5.16. Trace of interval value computation for example in figure 5.12


Marking each visited node (by setting its VISITED flag) avoids cycles; this keeps the tree walk
expanding outward from the starting node. The VISITED flags are reset as the routine backs out of the
tree walk, so all possible paths through the network are eventually analyzed.


(a) original circuit        (b) circuit as seen by tree walk



Figure  5.17.  The  tree  walk  traces  out  all  possible  paths 
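
The cost of tracing every path can be seen in a small sketch. The code below is an illustration only: the function name, the adjacency-dictionary encoding, and the three-node cyclic network are assumptions, and the routine merely counts the paths a walk would follow instead of merging interval values. With flags reset on the way out it follows every simple path to an input; with flags left set it expands each node once.

```python
def count_paths(node, neighbors, inputs, visited=None, keep_flags=False):
    """Count the paths a tree walk follows from `node` to any input.

    keep_flags=False -- reset each VISITED flag when backing out (all simple paths)
    keep_flags=True  -- leave flags set, so each non-input node is expanded once
    """
    if visited is None:
        visited = set()
    if node in inputs:
        return 1                           # an input terminates this path
    visited.add(node)
    total = 0
    for other in neighbors[node]:
        if other not in visited:
            total += count_paths(other, neighbors, inputs, visited, keep_flags)
    if not keep_flags:
        visited.discard(node)              # back out: allow other paths through node
    return total


# Hypothetical cyclic network: a, b, c form a triangle; b and c both reach input IN.
net = {"a": ["b", "c"], "b": ["a", "c", "IN"], "c": ["a", "b", "IN"]}
all_paths = count_paths("a", net, {"IN"})
single_pass = count_paths("a", net, {"IN"}, keep_flags=True)
```

On this network the full walk follows a-b-IN, a-b-c-IN, a-c-IN, and a-c-b-IN, while the single-pass variant follows only the two paths found while first expanding b.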


If the network contains cycles, the tree walk might lead to more computation than a series/parallel
analysis; this is a problem for circuits containing many potential cycles (such as barrel shifters),
especially during initialization when many of the paths are conducting because control nodes are X.
To speed up the calculation, a node's VISITED flag can be left set, restricting the search to a single path
through a cyclic network. This technique produces correct results only if paths leading away from a
node are explored in order of increasing resistance, i.e., one must ensure that the first time a node is
reached, it is by the path of least resistance. Of course, the flags must be reset once the entire
computation is complete; fortunately, the connection list provides a handy way of finding all the nodes
that are visited without resorting to yet another tree walk. Another alternative for speeding up the
calculation is the caching technique described in section 4.2.

5.3.3.  Interesting  properties  of  the  global  algorithm 

The event list serves to focus the attention of the global simulator: new values are computed
only for nodes which appear on the event list or which are electrically connected to event-list nodes.
Portions of the network that are quiescent are not examined by the simulator. Algorithms that have
this property are said to be selective-trace or event-driven algorithms and generally run much faster
than algorithms which are not event driven [Szygenda75].†

An interesting implication of selective trace is that special care must be taken to ensure that
"constant" nodes, such as the output of an inverter with its input tied to GND, are processed at least
once (otherwise they will have the wrong values). One technique is to treat VDD and GND as ordinary
inputs when first starting a simulation run, a sort of power-up sequence as VDD and GND change
from X to 1 and 0 respectively. Computing both the direct and indirect consequences of changes in
VDD and GND might involve a tremendous amount of computation since the whole circuit is affected;
often only computing the indirect consequences is a sufficient and less costly alternative.

Although there is no explicit mention of time in the global simulator, the first-in, first-out (FIFO)
processing of events imposes some ordering on the changes of node values. This ordering is similar to,
but not the same as, the unit-delay ordering used by many gate-level simulators. In an event-driven
unit-delay algorithm, the output of each gate that had an input change is recomputed using the current
values of the input nodes. The new output values are saved and imposed on the network only after
processing all gates. The net effect is that each computation cycle (representing a unit of time)
propagates information through one level of gate, i.e., each gate has unit delay. Because changes in
node values are imposed all at once, values change simultaneously, which can lead to problems in
circuits containing feedback paths.

† Exceptions to this rule are some hardware-based simulation algorithms, such as programs run on the Yorktown
Simulation Engine [Pfister82]. The builders of the YSE point out that simulations might well run slower because the
extra communication and branching needed to implement selective trace would compromise the parallelism and
pipelining used to great advantage in the YSE. However, if sufficiently large portions of the circuits could be ignored,
the overhead of selective trace could be worth the investment (see Chapter 6).

The global simulator implements a pseudo unit-delay algorithm. New events are added to the
end of the event list, so the oldest changes are processed before any consequences of those changes
are processed. Thus, FIFO event management leads to the same sequence of gate evaluations as a
unit-delay algorithm. However, because the global algorithm changes values in the network
incrementally rather than all at once, it is possible to find circuits that behave differently under the two
simulators:


(a) unit delay    (b) pseudo unit-delay

Figure  5.18.  Circuit  that  distinguishes  unit-delay  from  pseudo  unit-delay 


A 0-1 transition on the input causes a unit-delay algorithm to loop forever. The global algorithm
predicts only one transition, the output of whichever gate it processes first. Neither answer is
completely correct: the actual circuit enters a meta-stable state on a 0-1 input transition, eventually
settling to a particular configuration determined by subtle differences in the gains of the two gates. It
will not remain in the meta-stable state forever, so an infinite oscillation is a poor prediction. On the
other hand, the final configuration chosen by the global simulator depends on the order of some list in
the network data base. The predicted outcome is the same each time, but not necessarily the best
prediction.† The global simulator does not offer a general solution to the oscillation problem; both
simulators will oscillate on the following circuit.


† [Bryant81] suggests that the oscillation can be detected and the offending node values replaced by X,
but the technique for determining the number of oscillations to allow yields answers so large for
circuits of any substantial size that this is not a very practical alternative.

Figure 5.19. Circuit which causes both simulators to oscillate

Along the same lines, the global simulator predicts that the output of the circuit below will
oscillate when the input changes from 1 to 0.


Figure 5.20. Circuit with a node that is both an input and output


The actual output quickly rises to the balance point of the pullup/pulldown combination. In a
logic-level simulation, this corresponds to finding a solution to the equation a = ¬a, which has the
solution a = X (a reasonable logic-level representation for the balance point). This example is drawn
from a larger class of circuits where a node is both an input and output of the circuit. Since the
new-value computation uses current transistor states (determined by current node values) to predict the
new values, it is impossible to predict the value of a node that depends on its own value. This
limitation has not proven to be a problem in practical circuits.
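The contrast between unit-delay and pseudo unit-delay (FIFO) processing drawn earlier in this section can be made concrete with a small sketch. The circuit of Figure 5.18 is not reproduced in the text, so the example assumes a pair of cross-coupled NAND gates sharing one input, which shows the behavior described; the gate names, the helper functions, and the cycle limit are all illustrative choices rather than part of the simulator itself.

```python
from collections import deque

def nand(a, b):
    return 0 if (a == 1 and b == 1) else 1

# Assumed circuit: q = NAND(inp, r) and r = NAND(inp, q).
GATES = {"q": ("inp", "r"), "r": ("inp", "q")}
FANOUT = {"inp": ["q", "r"], "q": ["r"], "r": ["q"]}

def unit_delay(values, changed, max_cycles=10):
    """Recompute every affected gate from current values, then impose all at once."""
    values = dict(values)
    history = []
    for _ in range(max_cycles):
        affected = sorted({g for n in changed for g in FANOUT[n]})
        new = {g: nand(*(values[i] for i in GATES[g])) for g in affected}
        changed = [g for g in affected if new[g] != values[g]]
        if not changed:
            return values, history, True
        values.update(new)
        history.append((values["q"], values["r"]))
    return values, history, False          # did not settle within max_cycles

def pseudo_unit_delay(values, changed):
    """Process events first-in, first-out; each new value is imposed immediately."""
    values = dict(values)
    events = deque(g for n in changed for g in FANOUT[n])
    transitions = []
    while events:
        g = events.popleft()
        new = nand(*(values[i] for i in GATES[g]))
        if new != values[g]:
            values[g] = new
            transitions.append((g, new))
            events.extend(FANOUT[g])
    return values, transitions

start = {"inp": 1, "q": 1, "r": 1}         # the input has just changed from 0 to 1
ud_values, ud_history, ud_settled = unit_delay(start, ["inp"])
fifo_values, fifo_transitions = pseudo_unit_delay(start, ["inp"])
```

Under these assumptions the unit-delay loop keeps flipping both outputs together until the cycle limit is reached, while the FIFO version records a single transition on whichever gate happens to be first in the event list.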

5.4.  The  local  switch  model 

It is interesting to speculate about replacing the tree walk performed by the global simulator with
a strictly local computation. After all, the models of transistor behavior presented in Chapter 3 show
that a transistor is controlled by the voltages of its three terminal nodes, i.e., each transistor operates
independently, basing its behavior on only local information available at its terminals. The simulation
model described in this section works in much the same way. The basic operation involves updating
the terminal node values of a transistor switch using only information about their previous values and
the state of the switch.

Relaxation-based algorithms leave one a little nervous. Will the relaxation terminate? Does the
final answer depend on the order in which the individual computations are performed? These
questions are answered below, after a description of the algorithm itself.

5.4.1. Node values in the local switch model

The set of node values and the computation developed for the global simulator must be adapted
for use by the local simulator. The necessity for an adaptation is explained at the end of section 5.4.2.
(The discussion is postponed until after the local simulation algorithm has been presented, when it will
be easier to explain why the global simulator's techniques do not work in the local simulator's context.)

In  the  local  simulator,  a  node  value  is  a  pair 

<high, low>

that separately lists what type of connection exists to each of the two possible input signals. The high
component summarizes what is known about paths to VDD, and the low component describes paths to
GND. Ignoring for the moment switches with gates of X, four types of connections can be
distinguished for each component:

    ∞    no paths to inputs, no charge storage.

    S    charge storage.

    1    there is a path to the appropriate input, but it passes through one or more
         depletion switches.

    0    there is a path of conducting n-channel (gate = 1) and p-channel (gate = 0)
         switches to the given input.

A switch with a gate of X may or may not make a connection; the resulting path is characterized by an
interval describing the range of alternatives. $\binom{4}{2} = 6$ intervals are needed to describe all
possible combinations of paths.

The value of VDD is <0, ∞> and of GND is <∞, 0>; some other examples are shown in the
following figure.

(a) <1, 0>    (b) <[0,1], S>    (c) <S, S>

Figure 5.21. Examples of node values in the local simulator


This organization provides for many more values than actually needed by the simulator: many of the
values make distinctions that are not important in determining a node's logic state. For example, <1, 0>
and <S, 0> both represent values corresponding to pulled-down nodes; it does not matter what the
high component contributes if it is weaker than the low component. The advantage of this notation is
the ease of computing what a given signal looks like from the other side of a transistor switch:


(a) <1, 0>    (b) <[1,∞], [0,∞]>    (c) <1, 1>    (d) <∞, ∞>

Figure 5.22. <1, 0> value as seen across various transistor switches

This  will  prove  very  useful  in  describing  the  update  operation  below. 

Using the technology developed in section 5.1.1, a lattice can be constructed that indicates the
relative ordering of the various component values:


Figure 5.23. Lattice for the ten possible component values

The U operation can be used to calculate the result of considering two paths in parallel:

$$\langle h_1, l_1 \rangle \cup \langle h_2, l_2 \rangle = \langle h_1 \cup h_2,\; l_1 \cup l_2 \rangle \qquad (5.5)$$

Each component is merged separately according to the lattice given above. Similarly, two values can
be ordered by comparing their components:

$$\langle h_1, l_1 \rangle \le \langle h_2, l_2 \rangle \;\text{ iff }\; h_1 \le h_2 \text{ and } l_1 \le l_2 \qquad (5.6)$$

A logic state can be associated with a value <h, l> using the following table:

l \ h   | 0      [0,1]  [0,S]  [0,∞]  1      [1,S]  [1,∞]  S      [S,∞]  ∞
--------+--------------------------------------------------------------------
0       | X      X      X      X      0      0      0      0      0      0
[0,1]   | X      X      X      X      X      X      X      0      0      0
[0,S]   | X      X      X      X      X      X      X      X      X      0
[0,∞]   | X      X      X      X      X      X      X      X      X      X
1       | 1      X      X      X      X      X      X      0      0      0
[1,S]   | 1      X      X      X      X      X      X      X      X      0
[1,∞]   | 1      X      X      X      X      X      X      X      X      X
S       | 1      1      X      X      1      X      X      X      X      0
[S,∞]   | 1      1      X      X      1      X      X      X      X      X
∞       | 1      1      1      X      1      1      X      1      X      X

Table 5.3. Logic state associated with <h, l>


5.4.2. The local simulation algorithm

The  local  simulator  implements  a  relaxation-based  calculation  for  propagating  input  values 
through  the  network.  The  calculation  has  three  major  steps: 

Step 1. Determine the state of each transistor switch from its type and the logic
state of its gate node. If no switches are found that changed state since
the last examination, the network is said to have "settled" and the
simulator waits for more input.

Step  2.  Reset  each  non-input  node  value  to  its  charged  value,  a  value  that 
corresponds  to  the  node’s  last  logic  state  but  does  not  have  sufficient 
strength  to  force  the  value  of  any  neighboring  nodes. 

Step  3.  Repeatedly  pick  a  transistor  and  update  the  values  of  its  source  and  drain 
nodes  according  to  the  formula  given  below,  continuing  until  the 
relaxation  is  complete  (no  node  changes  value  as  the  result  of  an  update). 

Upon  completion,  return  to  Step  1. 

Each  of  these  steps  is  described  in  more  detail  below. 

Figure  5.5  shows  how  a  switch’s  state  is  determined  from  its  type  and  the  logic  state  of  its  gate 
node.  Once  determined,  the  switch  state  remains  stable  through  Steps  2  and  3  even  if  the  gate 
changes  value.  This  arrangement  is  necessary  for  the  correct  operation  of  the  simulator  since  a  node's 
value  might  temporarily  be  incorrect  during  the  relaxation  computation  while  information  continues  to 
propagate  towards  the  node  from  various  inputs.  For  example,  the  output  of  a  NAND  gate  may 
momentarily  appear  to  be  pulled-up,  because  the  near-by  pullup  affects  the  node's  value  before 
information  can  propagate  from  GND  up  the  pulldown  chain.  Since  there  are  no  guarantees  about  the 
ordering  of  updates,  a  node’s  value  is  known  to  be  correct  only  when  the  relaxation  process 
terminates. 

Step  2  makes  sure  that  the  relaxation  starts  off  with  a  clean  slate;  when  this  step  is  complete, 
only  input  nodes  have  values  that  can  cause  the  values  of  neighboring  nodes  to  change.  This  ensures 
that  values  for  non-input  nodes  are  determined  exclusively  by  the  values  of  the  input  nodes. 

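$$\text{charged value} = \begin{cases} \langle \infty, S \rangle & \text{current logic state} = 0 \\ \langle S, \infty \rangle & \text{current logic state} = 1 \\ \langle S, S \rangle & \text{current logic state} = X \end{cases} \qquad (5.7)$$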

If a node is not connected to any input, the charged value is an accurate representation of its final
value. The update calculation performs a rudimentary charge sharing computation; a charged node
can become connected to another charged node with the same logic state, and still maintain its value.
Connection to a charged node with a different logic state results in both node values becoming <S, S>.
Note that precharge/discharge circuits are simulated correctly.

An update operation involves the source and drain nodes of a single transistor switch. The new
values of the source and drain are calculated from their old values and the state (σ) of the switch:

$$\begin{aligned} v'_{source} &= v_{source} \cup \mathit{switch}(\sigma, v_{drain}) \\ v'_{drain} &= v_{drain} \cup \mathit{switch}(\sigma, v_{source}) \end{aligned} \qquad (5.8)$$

The function switch(σ, value) formalizes our intuition about the effect on a value as it passes through a
switch in a given state (see figure 5.22). The new value of a terminal node is the result of merging its
old value with the old value of the other terminal node after it has passed through the switch.


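$$\mathit{switch}(\sigma, \langle h, l \rangle) = \begin{cases} \langle \infty, \infty \rangle & \sigma = \text{open} \\ \langle h, l \rangle & \sigma = \text{closed} \\ \langle h + [0,\infty],\; l + [0,\infty] \rangle & \sigma = \text{unknown} \\ \langle h + [1,1],\; l + [1,1] \rangle & \sigma = \text{depletion} \end{cases} \qquad (5.9)$$

where "+" is the series operation described in the following table: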


+       | [0,0]  [0,1]  [0,S]  [0,∞]  [1,1]  [1,S]  [1,∞]  [S,S]  [S,∞]  [∞,∞]
--------+----------------------------------------------------------------------
[0,0]   | [0,0]
[0,1]   | [0,1]  [0,1]
[0,S]   | [0,S]  [0,S]  [0,S]
[0,∞]   | [0,∞]  [0,∞]  [0,∞]  [0,∞]
[1,1]   | [1,1]  [1,1]  [1,S]  [1,∞]  [1,1]
[1,S]   | [1,S]  [1,S]  [1,S]  [1,∞]  [1,S]  [1,S]
[1,∞]   | [1,∞]  [1,∞]  [1,∞]  [1,∞]  [1,∞]  [1,∞]  [1,∞]
[S,S]   | [S,S]  [S,S]  [S,S]  [S,∞]  [S,S]  [S,S]  [S,∞]  [S,S]
[S,∞]   | [S,∞]  [S,∞]  [S,∞]  [S,∞]  [S,∞]  [S,∞]  [S,∞]  [S,∞]  [S,∞]
[∞,∞]   | [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]  [∞,∞]

Table 5.4. Series operation for local simulator


In general, the local algorithm's predictions are more pessimistic than those of the global
simulator. The following figure illustrates the analysis performed by the local simulator for the circuit
shown in figure 5.9. (The global simulator's analysis is shown in figure 5.10.)

(a) original configuration    (b) after network settles

Figure 5.24. Local simulator analysis for circuit in figure 5.9


As shown in figure 5.24(b), the local simulator predicts the logic state of the output node to be X, a
pessimistic answer. (The global simulator predicts a logic state of 0.) On the other hand, the local
simulator cannot simply adopt the value set and computation of the global simulator. The reason why
is illustrated by the following figure.


(a) original configuration    (b) update order: #1, #2, ...    (c) update order: #1, #3, ...

Figure 5.25. Global simulator's computation using update operations

The figure shows the final node values (i.e., the values after the network has settled, and further
updates make no change to the network), assuming that the first few updates were performed in
different orders. Figure 5.25(b) shows the final node values if switch #1 is updated first, followed by
switch #2. Figure 5.25(c) shows the final node values if switch #1 is updated first, followed by
switch #3. As one can see, the value of the output node differs in the two examples.

If the local simulator's predictions of the final node values are to be independent of update
order, it must be the case that

$$\mathit{switch}(\sigma, \alpha \cup \beta) = \mathit{switch}(\sigma, \alpha) \cup \mathit{switch}(\sigma, \beta) \qquad (5.10)$$

In other words, it cannot matter if early estimates of a node's value (α) are transmitted to neighboring
nodes before additional information (β) arrives. Unfortunately, equation 5.10 is in direct conflict with
equation 5.2, which indicates that order makes a difference in the analysis of certain circuits (such as
the one in figure 5.9) when using the global simulator's value set. Thus, the local simulator cannot
simply adopt the global simulator's value set.
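For the local simulator's own value set, the requirement of equation 5.10 can be checked by brute force. The sketch below assumes the interval encoding used informally above (join takes the stronger endpoint, series the weaker) and treats each switch state as a series connection with a fixed interval. Since both operations act on the high and low components separately, checking single components suffices. All names are illustrative.

```python
from itertools import product

# Ranks 0..3 stand for 0, 1, S, inf; smaller rank = stronger connection (assumed).
INTERVALS = [(s, w) for s in range(4) for w in range(s, 4)]   # ten component values

# Each switch state acts as a series connection with a fixed interval:
# closed = [0,0] (leaves the value unchanged), open = [inf,inf] (always yields inf).
THROUGH = {"closed": (0, 0), "unknown": (0, 3), "depletion": (1, 1), "open": (3, 3)}

def join(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]))

def series(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]))

def switch_comp(state, c):
    return series(c, THROUGH[state])

# Collect every (state, a, b) for which switch does not distribute over the join.
violations = [
    (state, a, b)
    for state, a, b in product(THROUGH, INTERVALS, INTERVALS)
    if switch_comp(state, join(a, b)) != join(switch_comp(state, a), switch_comp(state, b))
]
```

Under these assumptions the list of violations is empty for all 400 combinations, which is what one expects when series is a maximum and join is a minimum over the same ordering.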

5.4.3.  Interesting  properties  of  the  local  algorithm 

In order to answer the questions raised when first introducing the local algorithm, some
definitions will be useful. Let S be the set of switch-state vectors σ₁σ₂···σₜ where t is the number
of transistor switches in the network. Similarly, let V be the set of node-value vectors v₁v₂···vₙ
where n is the number of nodes in the network. Then S × V is the set of possible network states.

Definition. Let X and Y be network states. X ≥ Y if S_X = S_Y and V_X ≥ V_Y,
where comparison between vectors is done component by component.

The update operation changes one network state to another; one writes X → Y if a sequence of zero
or more updates changes the network state X into the network state Y. X →ₘ Y means that m or
fewer updates will change X into Y.

The update operation can potentially change two elements of the node-value vector; the switch-
state vector is never affected by an update. Not every update causes the network state to change. For
example, if the update chooses an open switch, the resulting network state will be the same as the
original state. In the presentation below, it is useful to distinguish those updates that result in a
change in the network state from those that do not:

Definition. Let X and Y be network states. X ⇒ Y if X →₁ Y and X ≠ Y.

In fact, X ⇒ Y implies Y > X, a simple consequence of equation 5.9 and the definition of U. A
stable network state is one which does not change as the result of any update:

Definition. Let X be a network state. X is stable if, for any network state Y, X → Y
implies X = Y.

It follows directly from this definition that a state is stable if and only if no ⇒ operations are possible
on the state. Once a stable state is reached, the relaxation process can safely be terminated since
further updates will not change the network state. This suggests the following metric for measuring
how far the relaxation process has to go:

Definition. Let X be a network state. order(X) is defined to be the largest integer m
such that there exist states Y₁, ..., Yₘ where X ⇒ Y₁ ⇒ ··· ⇒ Yₘ.


The termination of the relaxation process is assured by the following theorem:

Theorem 5.1. For any network state X, order(X) is finite.

The proof is based on the observation that there are only finitely many network nodes and possible
node values. This means for any given network state X, there are finitely many states Y such that
Y > X. Since each ⇒ operation produces a state strictly greater than its predecessor, one can
perform the ⇒ operation only finitely many times before all the possible states are exhausted. ∎

For a given starting network state, Theorem 5.1 tells us that a stable state can be reached with
only a finite number of ⇒ operations. In fact, one can prove that there exists a unique stable state
for any network state, but first we must lay a little more groundwork.

Lemma 5.2. Let W and X be network states. If order(W) = m and W ⇒ X, then
order(X) < m.

Suppose that order(X) ≥ m; then there exists a sequence of ⇒ operations
W ⇒ X ⇒ Y₁ ⇒ ··· ⇒ Y_order(X). This implies order(W) ≥ m + 1, a contradiction. ∎

Lemma 5.3. (Church-Rosser property) Let W, X, and Y be network states. If
W →₁ X and W →₁ Y, then there exists a network state Z such that X → Z and
Y → Z.

Appendix 1 presents a proof based on a case by case analysis of the possible choices for X and Y,
demonstrating for each case a sequence of updates that lead to a common state Z.

This sets the stage for proving the uniqueness of the stable state. For readers acquainted with
the lambda calculus, the following theorem has a familiar ring. There are many similarities between
the update operation and λ-conversion; the discussion of normal forms and the Church-Rosser
theorem found in [Curry74] inspired the concept of stable states and the existence and uniqueness
theorems presented here.

Theorem 5.4. Let W, X, and Y be network states. If W → X and W → Y, then
there exists a network state Z such that X → Z and Y → Z.

The proof proceeds by induction on the order of W. If order(W) = 0, then W is stable and so
W = X = Y = Z. Without loss of generality, if order(W) > 0, one can assume X > W and
Y > W since if this were not the case, the result follows trivially. If order(W) = 1, the result follows
as a direct consequence of Lemma 5.3. To show for order(W) = n + 1, first note that there exist
states A and B such that W ⇒ A → X and W ⇒ B → Y. Then, by Lemma 5.3, there also exists a
state C such that A → C and B → C.

Figure 5.26. Relationship between states in proof for Theorem 5.4

Using Lemma 5.2, note that the orders of A, B, and C are all less than n + 1. Thus, by the induction
hypothesis, there exists a state D such that X → D and C → D. Similarly, there exists a state E such
that Y → E and C → E, also by the induction hypothesis. Finally, by a third appeal to the
induction hypothesis, there exists a state Z such that D → Z and E → Z. ∎

Taken together, Theorems 5.1 and 5.4 imply the following corollary:

Corollary 5.5. Let X be a network state. There exists a unique network state Y such
that Y is stable and X ⇒ ··· ⇒ Y.

Thus,  the  relaxation  process  terminates  for  any  starting  network  configuration,  yielding  the  same  stable 
state  regardless  of  the  order  chosen  for  performing  the  updates. 
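The order-independence claimed by Corollary 5.5 can be observed on a small example. The sketch below runs the relaxation of Step 3 with many random update orders and compares the stable states. The five-node network, the switch-state labels, and the decision to hold input nodes fixed are all assumptions made for illustration; the encoding of values follows the interval sketch used informally above.

```python
import random

INF = 3   # ranks 0..3 stand for 0, 1, S, inf; smaller rank = stronger (assumed)
THROUGH = {"closed": (0, 0), "unknown": (0, INF), "depletion": (1, 1), "open": (INF, INF)}

def join_c(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]))

def series_c(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]))

def join(v, w):
    return (join_c(v[0], w[0]), join_c(v[1], w[1]))

def switch(state, v):
    t = THROUGH[state]
    return (series_c(v[0], t), series_c(v[1], t))

def relax(values, switches, inputs, rng):
    """Apply updates in random order until a full pass changes nothing."""
    values = dict(values)
    changed = True
    while changed:
        changed = False
        order = list(switches)
        rng.shuffle(order)
        for src, drn, state in order:
            vs, vd = values[src], values[drn]          # old values of both terminals
            for node, old, new in ((src, vs, join(vs, switch(state, vd))),
                                   (drn, vd, join(vd, switch(state, vs)))):
                if node not in inputs and new != old:  # inputs are held fixed (assumed)
                    values[node] = new
                    changed = True
    return values

start = {
    "VDD": ((0, 0), (INF, INF)), "GND": ((INF, INF), (0, 0)),
    "out": ((2, 2), (INF, INF)),      # charged value for a stored 1
    "mid": ((INF, INF), (2, 2)),      # charged value for a stored 0
    "far": ((2, 2), (INF, INF)),
}
switches = [("VDD", "out", "depletion"), ("out", "mid", "closed"),
            ("mid", "GND", "closed"), ("out", "far", "unknown")]
results = [relax(start, switches, {"VDD", "GND"}, random.Random(seed)) for seed in range(20)]
```

Every run ends in the same stable state: `out` and `mid` settle at <1, 0>, and `far`, which sits behind a switch with an X gate, settles at <[1,S], [0,∞]>.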

One  of  the  attractions  of  the  local  algorithm  is  the  opportunity  it  affords  for  parallel  processing, 
especially  during  the  relaxation  process.  Allowing  parallel  updates  introduces  the  problem  of  merging 
conflicting  node  values  at  the  end  of  the  updates.  The  simplest  solution  is  to  allow  updates  to  happen 
simultaneously  only  if  they  operate  on  separate  portions  of  the  network  state.  With  this  restriction, 
each node is involved in at most one update operation, and the potential for conflict is avoided. If the
number of available processors is a lot smaller than the number of nodes in the network, there is only
a small probability of a processor lying idle for lack of allowable updates.
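One way to respect the restriction that simultaneous updates touch separate nodes is to group the switches into batches in which no node appears twice. The greedy grouping below is only a sketch of that idea; the function name and the example switch list are hypothetical, and no claim is made that the batches are as few as possible.

```python
def conflict_free_batches(switches):
    """Greedily place each switch in the first batch that shares none of its nodes."""
    batches = []                      # list of (nodes_in_use, switches_in_batch)
    for sw in switches:
        src, drn = sw[0], sw[1]
        for nodes, batch in batches:
            if src not in nodes and drn not in nodes:
                nodes.update((src, drn))
                batch.append(sw)
                break
        else:
            # Every existing batch already touches src or drn: start a new batch.
            batches.append(({src, drn}, [sw]))
    return [batch for _, batch in batches]

# Hypothetical network: each switch is given as (source node, drain node).
switches = [("VDD", "out"), ("out", "mid"), ("mid", "GND"), ("a", "b")]
batches = conflict_free_batches(switches)
```

All updates within one batch can be performed at the same time, and the batches themselves are executed one after another.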

Parallel implementations that avoid conflicting updates are covered by the existence and
uniqueness results obtained above, since it is easy to convert the set of updates performed at any time
step into an equivalent sequence of sequential updates. This approach has sufficient parallelism to
keep many current parallel architectures quite busy. However, there are architectures on the drawing
boards with very large numbers of processors; it is interesting to speculate about algorithms that can
usefully employ as many processors as, say, there are transistors in the network.

To explore the possibilities, imagine a multiprocessor constructed of the following elements:


Figure  5.27.  Simulator  processing  elements 


Both  types  of  elements  synchronize  their  operation  to  a  four-phase  global  clock: 

Phase 1. The transistor element samples the values of its source and drain
connections and calculates new values using internal information about its
type and current state.

Phase 2. The newly updated values are driven on to the source and drain
connections by the transistor elements.

Phase 3. Each node element samples one of its three connections and computes
the least upper bound of the sampled value and its stored state. The
connections can be sampled in any convenient order; the only
requirement is that a connection not be ignored indefinitely.

Phase 4. The node elements drive their connections with the value computed
during Phase 3.

Note that the node element is particularly capricious; it ignores two of its three connections in any
given cycle. This complicates the notion of an update since there is no guarantee that the two node
elements attached to the source and drain connections of a transistor element will be listening when
the results of an update are made available. It becomes especially confusing when one of the elements
is listening and one is not, which results in "half" an update. Of course, one can conceive of less
bizarre node elements, but if it is possible to prove correct operation under the proposed conditions, a
much wider class of parallel architectures will be appropriate for the local algorithm.

The elements are wired together in a way that mirrors the topology of the network to be
simulated; multiple node elements are used to model network nodes with a large number of
connections.

(a) circuit schematic    (b) element interconnect

Figure 5.28. Example wiring diagram for simulator elements

By providing one processor per transistor and node, this implementation exhibits all the parallelism
one could reasonably expect. Steps 1 and 2 of the local algorithm are accomplished in a single clock
cycle. During Step 3, an update calculation for each transistor is performed every clock cycle. A
wired-OR signal visiting all the node elements can detect when the relaxation process is complete; a
similar signal connected to all transistor elements can indicate when the network has settled.

This  scheme  is  not  as  fanciful  as  it  seems  —  the  Connection  Machine  project  [Hillis81]  now 
underway  at  the  M.I.T.  Artificial  Intelligence  Laboratory  has  an  architecture  well  suited  to  an 
implementation  similar  to  the  one  described  above.  Fully  configured,  its  one  million  elements  would 
be  able  to  simulate  sizeable  circuits  at  very  high  speeds.  However,  the  real  purpose  in  proposing  this 
architecture  is  to  provide  a  vehicle  for  analyzing  the  operation  of  the  local  algorithm  in  a  parallel 
environment. 

A key insight into the design of a parallel engine is that the value stored by each node element
must be non-decreasing with time, i.e., if v_i^1, ..., v_i^k are the values of node element i at successive
clock cycles, then v_i^1 ≤ ··· ≤ v_i^k. The "ratcheting" of node values up the lattice, which was crucial in
showing termination of the relaxation in a sequential implementation, must be preserved in the parallel
implementation. With this in mind, consider adding a communications link between two node
elements:

Figure 5.29. Simulation engine incorporating communication link

Since  the  system  must  already  accommodate  the  unpredictable  behavior  of  node  elements,  the 
demands  on  the  link  are  minimal;  messages  cannot  be  garbled  and  the  network  cannot  become 
partitioned  indefinitely.  However,  messages  can  be  dropped  or  delivered  in  any  order  since  these 
failures  do  not  affect  the  monotonicity  of  a  node’s  value. 
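The tolerance for dropped and reordered messages follows from the way a node element folds each received value into its stored state with a least upper bound. The sketch below simulates a lossy link under the assumption that a sender keeps re-sending a value until it is heard. The interval encoding, the drop probability, and the function names are illustrative only.

```python
import random
from functools import reduce

# Values are <high, low> pairs of (strong, weak) rank intervals; smaller rank = stronger.
def join(v, w):
    return tuple((min(a[0], b[0]), min(a[1], b[1])) for a, b in zip(v, w))

def leq(v, w):
    return all(b[0] <= a[0] and b[1] <= a[1] for a, b in zip(v, w))

def lossy_delivery(stored, messages, rng, drop_prob=0.5, rounds=200):
    """Each round every unheard message is re-sent; the link may drop any of them."""
    pending = list(messages)
    trace = [stored]
    for _ in range(rounds):
        rng.shuffle(pending)                     # arbitrary delivery order
        still_pending = []
        for msg in pending:
            if rng.random() < drop_prob:
                still_pending.append(msg)        # dropped; the sender tries again later
            else:
                stored = join(stored, msg)       # node element keeps the least upper bound
                trace.append(stored)
        pending = still_pending
        if not pending:
            break
    return stored, trace

start = ((2, 2), (3, 3))                                        # <S, inf>
messages = [((1, 1), (3, 3)), ((3, 3), (0, 0)), ((1, 3), (0, 3))]
expected = reduce(join, messages, start)
runs = [lossy_delivery(start, messages, random.Random(seed)) for seed in range(10)]
```

Because the join is commutative, associative, and idempotent, each run reaches the same final value, and the stored value never moves down the lattice along the way.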

Two  important  questions  remain  to  be  answered  about  parallel  implementations  that  allow 
conflicting  updates: 

(1)  Is  there  an  analog  for  Lemma  5.3? 

(2) Does this parallel implementation give the same answer as the sequential
implementation? 

The  author's  speculation  is  that  both  questions  can  be  answered  affirmatively.  This  belief  is  based  on 
the  observations  that  no  information  is  lost  that  cannot  be  recalculated,  and  the  operation  of  the 
switches  and  merging  of  results  remains  unchanged.  Given  that  the  order  in  which  the  propagation 
happens  was  shown  to  be  irrelevant  by  Theorem  5.4,  it  seems  unlikely  that  the  slightly  more  baroque 
propagation  mechanism  of  a  parallel  implementation  would  seriously  change  the  picture. 


Simulation  Using  a  Pre-compiled  Network  Model 


The simulation algorithms presented in previous chapters rely on examination of the surrounding
network to determine the value of a given node. The surrounding network is re-examined every time
the node's value needs recalculation. This chapter investigates breaking this process into two steps: a
single complete network analysis which builds a set of four logic equations for each node, indicating
the types of connections between the node and VDD or GND; and simulation, where the value of each
node is determined by evaluating its equations built during the first step. Not only is the overhead of
a tree walk avoided each time a node value is calculated, but evaluating logic equations is also a very
fast operation for most computers.

Each step is discussed in a separate section. The first section describes the derivation of logic
equations for each network node, even those which are not directly outputs of MOS logic gates. The
second section presents several approaches for building a logic simulator based on the evaluation of
the node equations.

6.1. Reducing switch paths to logic equations

The switch-level algorithm in Chapter 5 determines the value of a node from information about
the node's current connections to VDD and GND. The information is regathered each time a new value
is calculated for the node. In most cases, only a small number of potential paths exist from a node to
VDD and GND. This suggests that it might be economical to determine ahead of time the conditions for
which a path exists to, say, GND. For example, the output of a NOR gate with inputs A and B is pulled
down if either A or B is non-0. The existence of a pulldown path can be determined by evaluating the
expression "A OR B"; a search of the network is not required to discover which pulldowns are
currently conducting.

This section describes the derivation of a set of four Boolean equations for each node:

DH_A    An expression indicating under what conditions a path of conducting
        n-channel and/or p-channel devices exists from node A to VDD.

DL_A    An expression indicating under what conditions a path of conducting
        n-channel and/or p-channel devices exists from node A to GND.

WH_A    same as DH_A, except the path contains at least one depletion device.

WL_A    same as DL_A, except the path contains at least one depletion device.

If an expression evaluates to true (1), the corresponding path exists; if the expression evaluates to false
(0), no path exists. Since nodes can have X values, expressions involving node values can evaluate to
X; in this case, the corresponding path may or may not exist. The equations involve the ordinary
Boolean operators AND ("·"), OR ("+"), and NOT ("¬"). These operations are easily extended to
accommodate X values:


AND | 0  1  X        OR | 0  1  X        NOT |
----+---------       ---+---------       ----+---
 0  | 0  0  0         0 | 0  1  X         0  | 1
 1  | 0  1  X         1 | 1  1  1         1  | 0
 X  | 0  X  X         X | X  1  X         X  | X
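The extended operators can be written directly from the truth tables above. The sketch below represents X by the string "X" and uses the NOR gate pulldown mentioned at the start of this section as a usage example; the function names are illustrative.

```python
X = "X"   # the unknown value; 0 and 1 are ordinary integers

def and3(a, b):
    if a == 0 or b == 0:
        return 0                 # a single 0 forces the result
    if a == 1 and b == 1:
        return 1
    return X

def or3(a, b):
    if a == 1 or b == 1:
        return 1                 # a single 1 forces the result
    if a == 0 and b == 0:
        return 0
    return X

def not3(a):
    return X if a == X else 1 - a

def nor_pulldown(a, b):
    """Pulldown condition for a NOR gate output: the expression "A OR B"."""
    return or3(a, b)

and_table = [[and3(a, b) for b in (0, 1, X)] for a in (0, 1, X)]
or_table = [[or3(a, b) for b in (0, 1, X)] for a in (0, 1, X)]
not_table = [not3(a) for a in (0, 1, X)]
```

Note that `nor_pulldown(1, X)` is 1 even though one input is unknown, because a single conducting pulldown is enough to establish the path; `nor_pulldown(0, X)` stays X.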

The algorithm for constructing logic equations is similar to that for computing the Thevenin
equivalent for a node (see section 4.1.2). The algorithm begins with an expanding tree walk, stopping
when an input or dead-end is reached. During the tree walk, all switches are assumed to be on, since
the tree walk is performed before any node values are calculated. (During simulation, the actual state
of the switch is represented symbolically in the equation.) The algorithm continues by retracing the
steps of the tree walk back toward the original node; during this process, the equations are built. The
equations for the terminal nodes are trivial; the following table is the analogue of figure 4.8:

terminal node    DH   DL   WH   WL

VDD              1    0    0    0

GND              0    1    0    0

dead-end         0    0    0    0

Merging the equations for two (or more) paths which join at a given node occurs in several steps.

[Figure 6.1. Merging the equations for two paths which join: (a) two paths to merge; (b) after
incorporating switches; (c) final path equations]

The process begins by modifying the equations for each path to reflect the contribution of the switch
in series with the path (figure 6.1(b)). The necessary formulas appear below. For example, DH' is the
new equation derived by combining DH with gate, the value of the switch's gate node.

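\[
DH' = \begin{cases}
DH \cdot \mathit{gate} & \text{n-channel switch} \\
DH \cdot \neg \mathit{gate} & \text{p-channel switch} \\
0 & \text{depletion switch}
\end{cases} \tag{6.1}
\]

\[
DL' = \begin{cases}
DL \cdot \mathit{gate} & \text{n-channel switch} \\
DL \cdot \neg \mathit{gate} & \text{p-channel switch} \\
0 & \text{depletion switch}
\end{cases} \tag{6.2}
\]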

The equations for the "strong" paths (above) are straightforward; when the connection is made by a
regular switch, the path equation and the switch's gate value are combined using AND. If the
connection is made with a depletion device, the strong path is terminated. Equations for "weak" paths
(below) are slightly more complicated since a depletion switch changes a strong path into a weak one.
These formulas also reflect the fact that a strong path overpowers a weak path, i.e., equations for weak
paths are forced to 0 if a strong path is present. The reason for this extra complication will be clear in
an example below.


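\[
WH' = \begin{cases}
\mathit{gate} \cdot WH \cdot \neg DL & \text{n-channel switch} \\
\neg \mathit{gate} \cdot WH \cdot \neg DL & \text{p-channel switch} \\
DH + (WH \cdot \neg DL) & \text{depletion switch}
\end{cases} \tag{6.3}
\]

\[
WL' = \begin{cases}
\mathit{gate} \cdot WL \cdot \neg DH & \text{n-channel switch} \\
\neg \mathit{gate} \cdot WL \cdot \neg DH & \text{p-channel switch} \\
DL + (WL \cdot \neg DH) & \text{depletion switch}
\end{cases} \tag{6.4}
\]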


After the equations for each path are modified to include the series switches, they are combined (using
OR) to derive the final equations for the node, as shown in figure 6.1(c). When the analysis for a node
is complete, the four equations characterize all paths from the node to VDD and GND.
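The series-switch formulas 6.1 through 6.4 and the OR-merge step can be illustrated with a small Python sketch. As a simplifying assumption it evaluates the formulas on concrete node values (0, 1, or X) instead of building symbolic expressions as the analysis described here does; all function and variable names are hypothetical.

```python
X = "X"  # unknown value

def and3(*vals):
    if any(v == 0 for v in vals):
        return 0
    return 1 if all(v == 1 for v in vals) else X

def or3(*vals):
    if any(v == 1 for v in vals):
        return 1
    return 0 if all(v == 0 for v in vals) else X

def not3(a):
    return X if a == X else 1 - a

# A path is a tuple (DH, DL, WH, WL); these are the terminal-node entries.
VDD = (1, 0, 0, 0)
GND = (0, 1, 0, 0)

def through_switch(kind, gate, path):
    """Path values as seen through one series switch (formulas 6.1-6.4)."""
    dh, dl, wh, wl = path
    if kind == "depletion":
        # A depletion device ends strong paths and turns them into weak ones.
        return (0, 0,
                or3(dh, and3(wh, not3(dl))),
                or3(dl, and3(wl, not3(dh))))
    g = gate if kind == "n" else not3(gate)  # p-channel conducts when gate is 0
    return (and3(dh, g), and3(dl, g),
            and3(g, wh, not3(dl)), and3(g, wl, not3(dh)))

def merge(paths):
    """Combine the paths that join at a node by OR-ing each component."""
    return tuple(or3(*column) for column in zip(*paths))

def inverter_with_pass_gate(a, b):
    """(DH, DL, WH, WL) for a node behind pass gate b on the output of an inverter of a."""
    inverter_out = merge([through_switch("depletion", None, VDD),
                          through_switch("n", a, GND)])
    return through_switch("n", b, inverter_out)
```

For an inverter with a depletion pullup followed by an n-channel pass gate, the sketch yields DL = B · A and WH = B · ¬A, because the ¬DL factor in formula 6.3 removes the weak pullup whenever the pulldown conducts.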


(a)  original  network  (b)  network  after  analysis  is  complete 

Figure  6.2.  The  four  equations  characterize  all  paths  from  node 


In other words, for each node, the surrounding network (figure 6.2(a)) has been reduced to an
equivalent, but much simpler, network (figure 6.2(b)). All the information about paths in the original
network  is  now  stored  in  the  node  equations,  where  it  can  be  efficiently  utilized.  For  example,  to 
determine  if  a  node  is  pulled-down,  all  one  has  to  do  is  evaluate  the  DL  equation  —  no  examination 
of  the  network  is  necessary. 

The value of a node can be determined from the values of the four equations and the node's
previous  value,  by  table  lookup: 




                                     DH/WH
DL/WL    00       01   0X       10   11   1X   X0       X1   XX
  00     prev     1    prev+X   1    1    1    prev+X   1    prev+X
  01     0        X    X        1    1    1    X        X    X
  0X     prev·X   X    X        1    1    1    X        X    X
  10     0        0    0        X    X    X    X        X    X
  11     0        0    0        X    X    X    X        X    X
  1X     0        0    0        X    X    X    X        X    X
  X0     prev·X   X    X        X    X    X    X        X    X
  X1     0        X    X        X    X    X    X        X    X
  XX     prev·X   X    X        X    X    X    X        X    X

Table 6.1. Node value table for equation-based simulation
There are a few special cases which can be summarized more concisely.† For most nodes in nMOS
circuits, DH = WL = 0, i.e., connections to VDD are made only through depletion pullups, and
depletion devices are not used elsewhere in the circuit. In this case, the value of a node is given by a
single equation:

\[ \text{node value} = (WH + \text{previous value}) \cdot \neg DL \qquad (\text{when } DH = WL = 0) \tag{6.5} \]

Equation 6.5 can be simplified further for a node that is directly pulled up (WH = 1), i.e., a node
which is the output of a logic gate:

\[ \text{node value} = \neg DL \qquad (\text{when } DH = WL = 0 \text{ and } WH = 1) \tag{6.6} \]

In most cases, therefore, calculating the value of a node requires evaluating only a single equation.

Some  examples  will  help  illustrate  the  analysis.  First,  consider  an  inverter  with  a  pass  gate 
connected  to  its  output. 


[Figure 6.3. Logic equations for output of inverter with series pass gate. For the output node C, with
inverter input A and pass-gate control B: DH_C = 0, DL_C = B · A, WH_C = B · ¬A, WL_C = 0]


†Current hardware simulation engines [Pfister82, Zycad83] implement all functions through table lookup, so they can
implement the function tabled above as efficiently as, say, Boolean operations. This is not true of most general-
purpose machines; hence the motivation for finding simpler representations where possible.

Using equation 6.5, the value of C is given by C = (B · ¬A + C) · ¬(B · A). The value of this
equation is tabled below for the various values of A and B.


                  B
  C'       0      1      X
      0    C      1      C+X
  A   1    C      0      C·X
      X    C      X      X

When B is 0, the pass gate is turned off, and C retains its old value. When B is 1, the pass gate is on,
and C is the complement of A. Finally, when B is X, C is also X, except when the output of the
inverter is the same as the previous value of C. In this case, the output retains its old value, which
makes sense since there is nothing forcing it to change. This last statement is true only because
WH_C = B · ¬A; the ¬A term forces the pullup equation to 0 when the pulldown of the inverter is
active. If the WH equation did not reflect the contribution of the pulldown, i.e., if WH_C = B, the
value of C would be unnecessarily forced to X when the value of B was X.
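The behaviour just described can be checked with a short Python sketch of equation 6.5 applied to node C. The helper names are assumptions made for illustration; `c_next_naive` models the hypothetical case WH_C = B mentioned above, not anything the simulator actually generates.

```python
X = "X"  # unknown value

def and3(a, b):
    if a == 0 or b == 0:
        return 0
    return 1 if (a == 1 and b == 1) else X

def or3(a, b):
    if a == 1 or b == 1:
        return 1
    return 0 if (a == 0 and b == 0) else X

def not3(a):
    return X if a == X else 1 - a

def node_value(wh, dl, prev):
    """Equation 6.5: (WH + previous value) . NOT DL, valid when DH = WL = 0."""
    return and3(or3(wh, prev), not3(dl))

def c_next(a, b, c):
    """New value of C using WH_C = B . NOT A and DL_C = B . A."""
    return node_value(and3(b, not3(a)), and3(b, a), c)

def c_next_naive(a, b, c):
    """Same, but with the pulldown's contribution left out of WH (WH_C = B)."""
    return node_value(b, and3(b, a), c)
```

With A = 1, B = X and a previous value of 0, `c_next` keeps C at 0, while `c_next_naive` returns X, which is the unnecessary loss of information that the ¬DL factor in formula 6.3 avoids.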

The  next  example  is  the  XOR  gate  presented  in  Chapter  2. 


Figure 6.4. XOR logic gate


The equations for each node appear in the following table.

node   DH   DL            WH   WL
C      0    A + D·C·B     1    0
D      0    B + C·D·A     1    0
E      0    C·B + D·A     1    0
F      0    E             1    0

These equations might seem incorrect at first; it is not at all obvious that F = A XOR B. However,
simplifying the equations for C and D shows:

\[ C = \neg(A + D \cdot C \cdot B) = \neg(A + \neg(B + C \cdot D \cdot A) \cdot C \cdot B) = \neg A \tag{6.7} \]

and similarly, D = ¬B. These results can be used to rewrite the equation for F in terms of A and B:

\[ F = \neg E = C \cdot B + D \cdot A = \neg A \cdot B + \neg B \cdot A = A \ \mathrm{XOR}\ B \tag{6.8} \]

In  actual  use,  the  equations  are  not  simplified.  The  above  substitutions  do  verify,  however,  that  the 
equations  compute  the  correct  value  for  F. 

Some circuit configurations have very simple connection paths during actual operation of the
circuit, but the circuits can appear very complicated when no information is known about the values of
various control lines. This is especially true of a circuit containing nMOS switching logic, such as a
barrel shifter or tally circuit. If no information is available about the values of the control lines in a
barrel shifter, it appears to short together all the incoming and outgoing data bits. The logic equations
for a node in such a circuit can become very large; in some cases, large enough to be impractical.
The analysis procedure monitors the size of the equations under construction. If they grow too large,
the procedure is aborted and the node is flagged. At simulation time, the value of a flagged node is
determined using the normal switch-level simulation algorithm.† Flagging a small number of nodes
eases the analysis of the remainder of the circuit. (The number of flagged nodes has been less than
1% of the total number of nodes in all the designs processed to date.) Using this technique, the
speed-up in simulation afforded by the use of logic equations can be enjoyed by circuits even where
100% conversion to equations is not possible.

Keeping track of gate expressions for transistors crossed during the initial, expanding phase of
the tree walk allows the equation-building algorithm to eliminate duplicate AND terms in the results.

†Reversion to ordinary switch-level simulation for especially complicated circuits is easily accomplished by general-
purpose computers, but can be next to impossible for special-purpose hardware.

[Figure 6.5. On-the-fly elimination of duplicate AND terms: (a) original circuit; (b) reduced circuit]

This minor optimization can reduce equation size substantially in some circuits. Consider, for example,
a tally circuit from [Mead80].

[Figure 6.6. Tally circuit: inputs A, C, and E; outputs Z0 through Z3]

This tally circuit has three inputs: A, C, and E. A tally circuit counts the number of 1-inputs; Z0 = 1
when no inputs are high, Z1 = 1 when exactly one input is high, and so on. The equations produced
for the outputs appear somewhat complicated, for example:

DLZ i  =  B-(A  +D(C  +  F  +  EF)+C(D+E  +  F-E))  +  A  (B  +C  +  D(C  +  E  +  FE))  (6.9) 

WHZ  i  =  B(DE+CF  +  A-CE)  +  AD(F +  C(E  +  BE))  (6.10) 

These equations are hard to verify as they are, but they can be simplified by removing B, D, and F.
(Again, the simulator does not simplify the equations, but this is the easiest method for us to use to
verify the operation of the algorithm.) Using the identities B = ¬A, D = ¬C, and F = ¬E, the
equations reduce to:

\[ DL_{Z1} = \neg A \cdot \neg C \cdot \neg E + \neg A \cdot C \cdot E + A \cdot C + A \cdot \neg C \cdot E \tag{6.11} \]

\[ WH_{Z1} = \neg A \cdot \neg C \cdot E + \neg A \cdot C \cdot \neg E + A \cdot \neg C \cdot \neg E \tag{6.12} \]

Substituting these formulas into equation 6.5 gives

\[ Z1 = A \cdot \neg C \cdot \neg E + \neg A \cdot C \cdot \neg E + \neg A \cdot \neg C \cdot E \tag{6.13} \]

As expected, Z1 is true if exactly one input is high. Of course, evaluating this last equation would be
much faster than using the original equations, 6.9 and 6.10. Unfortunately, equation simplification is a
very time-consuming operation; the computational investment required to process all the equations for
a large circuit would probably not be recovered by decreased simulation time. In addition, the
equations for most nodes are simple, and simplification beyond that suggested by equation 6.6 (a
simplification which is easily recognized) does not result in much improvement.

6.2.  Compiling  logic  equations  for  simulation 

It  is  easy  to  build  a  simulator  that  uses  the  node  equations  developed  in  the  previous  section. 
The simplest approach [Denneau82] is to allocate two node-value arrays; one to hold the current
values of each node, and the other to collect new node values as they are computed. Each node is
assigned  an  index  which  can  be  used  to  access  its  current  value  in  the  first  array,  or  to  store  its  new 
value  in  the  second  array.  A  simulation  subroutine  for  the  network  is  built  by  generating  code  that 
calculates  the  value  of  each  node,  where  the  code  for  one  node  is  followed  by  the  code  for  the  next. 
(Since  new  node  values  are  kept  separate  from  the  current  node  values,  the  order  in  which  nodes  are 
processed  by  the  compiler  does  not  matter.)  A  single  simulation  step,  which  propagates  new  input 
values  to  other  nodes  in  the  network,  is  implemented  as  follows: 

(1)  For  each  input  node,  set  its  current-value  array  entry  to  the  designated  input 
value. 

(2)  Execute  the  simulation  subroutine.  This  fills  the  new-value  array. 

(3)  Compare  the  current-value  and  new-value  arrays.  If  their  contents  are  identical, 
the  network  has  settled  and  the  simulation  step  is  over.  Otherwise  copy  the 
new-value  array  to  the  current-value  array,  and  return  to  step  (1). 

This simulation algorithm has several interesting properties. Each execution of the simulation
subroutine corresponds to one step of a unit-delay simulator. Node values are updated all at once in
step (3); hence, the simulator implements a true unit-delay algorithm as described in section 5.3.3.
Note that no special handling of input nodes is required when generating code; the new values
calculated for input nodes in step (2) are overridden by user-specified values in step (1). Note also
that the calculations of the simulation subroutine are not event driven; the implications are discussed
below.
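The three steps of a simulation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: dictionaries stand in for the two node-value arrays, the "compiled" subroutine is a hand-written function for a hypothetical chain of three inverters, and X values are left out for brevity.

```python
def subroutine(cur):
    """Stand-in for the generated code: every node is computed from current values only."""
    return {
        "a": cur["a"],        # input node; whatever is computed here is overridden in step (1)
        "b": 1 - cur["a"],    # b = NOT a
        "c": 1 - cur["b"],    # c = NOT b
        "d": 1 - cur["c"],    # d = NOT c
    }

def simulation_step(current, inputs, subroutine):
    """Run steps (1)-(3) until the network settles; return the number of executions."""
    executions = 0
    while True:
        current.update(inputs)        # step (1): force the designated input values
        new = subroutine(current)     # step (2): fill the new-value array
        executions += 1
        if new == current:            # step (3): identical contents, so the network has settled
            return executions
        current.update(new)           # otherwise copy new values and repeat

values = {"a": 0, "b": 1, "c": 0, "d": 1}
count = simulation_step(values, {"a": 1}, subroutine)
```

In this example the change on `a` needs three executions to reach `d`, and a fourth execution is spent only on confirming that nothing changes any more, so `count` is 4.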

The value of a node is computed from its four node equations, using the code generated by one
of the following alternatives:

(1) If DH = WL = 0 and WH = 1, emit code that calculates the node value using
equation 6.6.

(2) If DH = WL = 0 and WH ≠ 1, emit code that calculates the node value using
equation 6.5.

(3) Otherwise, emit code which evaluates each of the four node equations, and then
concatenates the resulting values with the previous value of the node to create an
index into Table 6.1. As an optimization, the code generator can check for other
special cases (constant values for WH and WL) and generate accesses to smaller
tables if appropriate.

Code is generated for each equation using standard compilation techniques. The logic instructions of
the target machine are used for expression evaluation. (Some provision must be made to incorporate
X values in a way that still permits use of the native logic instructions; see the example at the end of
this section.) Access to a node's current value requires only an indexed reference into the current-value
array; storing generated values requires an indexed reference to the new-value array.

There  are  some  inefficiencies  inherent  in  this  approach.  An  extra  execution  of  the  simulation 
subroutine  is  performed  during  each  simulation  step  —  "extra"  in  the  sense  that  the  last  execution 
produces  the  same  result  as  the  one  before  (that  is  how  the  simulator  identifies  it  as  the  last 
execution).  In  addition,  the  value  of  each  node  is  calculated  during  each  call  to  the  simulation 
subroutine,  even  if  the  inputs  to  the  node's  equations  have  not  changed. 

This last objection can be addressed by making a more intelligent choice about the order in
which node values are calculated, by identifying the nodes that affect node A's value (i.e., nodes that
appear in the equations for A) and then generating code to compute the values of these nodes before
generating code to compute the value of A [Case78, Denneau82]. In addition, references to a node's
current value are directed to the new-value array if a new value for the node was computed earlier in
the subroutine. For example, the circuit in the following figure has several cascaded logic gates.

Figure 6.7. Cascaded logic gates

Under the new organization, the compiler generates code for nodes A and B before generating code
for node E, and so on. The resulting code propagates a new input value from A to H in a single
execution. (The earlier scheme would have required three calls to the simulation subroutine to achieve
the same effect.)

To  implement  this  scheme,  the  compiler  assigns  a  numeric  level  to  each  node.  The  level  of  input 
nodes  is  defined  to  be  0;  the  level  of  a  non-input  node  a  is 

level(a)  =  1  +  max(  level  of  nodes  affecting  a  )  (6.14) 

Referring to the example in figure 6.7, if nodes A through D are inputs, level(E) = 1 and
level(H) = 3. Code is first generated for level 1 nodes, then level 2 nodes, and so on. When
compiling an equation, if a node value is needed, the node's level determines where that value comes
from. The value of a level 0 node is taken from the current-value array, and the value of a node with
a level greater than 0 is taken from the new-value array. (New values are stored in the new-value
array, as always.)
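Equation 6.14 amounts to a longest-path computation over the "affects" relation. The Python sketch below assumes a feedback-free circuit; the netlist is a hypothetical one, loosely modelled on the cascaded gates discussed above and chosen only so that level(E) = 1 and level(H) = 3, and the names `affecting` and `assign_levels` are illustrative.

```python
def assign_levels(affecting, inputs):
    """Return {node: level}; inputs are level 0, others 1 + max level of the nodes affecting them."""
    levels = {node: 0 for node in inputs}

    def level(node):
        if node not in levels:
            # Recursion terminates only if the circuit has no feedback.
            levels[node] = 1 + max(level(src) for src in affecting[node])
        return levels[node]

    for node in affecting:
        level(node)
    return levels

# Hypothetical netlist: each non-input node maps to the nodes appearing in its equations.
affecting = {"E": ["A", "B"], "F": ["C", "D"], "G": ["E", "F"], "H": ["G"]}
levels = assign_levels(affecting, inputs=["A", "B", "C", "D"])

# Code would be emitted for non-input nodes in order of increasing level.
emit_order = sorted(affecting, key=levels.get)
```

Because Python's sort is stable, nodes of equal level keep their original relative order, which is harmless here since nodes on the same level never depend on one another.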

The  definition  of  a  node's  level  in  equation  6.14  runs  into  some  difficulty  if  the  circuit  has 
feedback.  Consider,  for  example,  the  following  circuit: 


In attempting to assign a level to node K, one discovers that the definition is circular, i.e., the level of
node K is defined in terms of itself. The compiler solves this problem by arbitrarily splitting a node
that is in the feedback loop into two nodes. One copy is treated as an input and the other as a
normal network node. Both are assigned the same index so that the input value is updated each time
the new-value array is copied to the current-value array. Thus, the circuit in figure 6.8 is compiled as
if it had the following configuration:


[Figure 6.9. Feedback circuit as it appears to the compiler: node K is split into a copy K' that is
treated as an input and a regular node K; the value is fed back during step (3)]


For the purposes of compilation, the feedback loop is broken; the value is actually fed back during
step (3) above when the new-value array is copied to the current-value array. This means that a
circuit containing feedback might require more than a single execution of the simulation subroutine
before the network settles. As it turns out, most MOS circuits contain feedback loops since charge
decay requires that storage nodes be refreshed. A clocked feedback loop offers special compilation
opportunities, which are discussed below.

Compiling  nodes  by  level  ensures  that  only  a  single  execution  of  the  simulation  subroutine  is 
needed  to  settle  the  network,  assuming  the  network  contains  no  feedback.  The  new  organization 
introduces  other  differences  from  the  original  compilation  strategy.  Node  values  are  not  updated  all  at 
once  in  this  scheme;  the  simulation  subroutine  implements  a  pseudo  unit-delay  simulation.  Input 
nodes  must  be  assigned  a  level  of  0,  which  means  nodes  must  be  declared  as  inputs  before  the 
compilation  process  begins.  This  eliminates  the  possibility  of  interactive  debugging,  where  one  wants 
the capability to consider any node as an input. Typically, the designer uses the original compilation
strategy when initially checking out the circuit, and then uses compilation-by-level when performing
long verification runs.

Most node-value references are satisfied using the new-value array in the compilation-by-level
scheme. This suggests that it might be worthwhile to eliminate the storage overhead and copying time
involved in managing two arrays by merging them into a single array. This is straightforward,
provided a new technique is developed for detecting when the simulation step is complete. If the
circuit has no feedback, only a single execution of the code is needed. When there is feedback, a
single execution also suffices, if the current and new values of split nodes (e.g., K and K' in figure 6.9)
agree. Only when the old and new values are different is another execution required. This can be
arranged by comparing the two values before the new value is stored into the array. If the
comparison shows them to be unequal, a flag is set to indicate that another execution is needed. Note
that the whole simulation subroutine is re-executed; this is simpler than trying to untangle interlocking
feedback loops to determine the subset of the code that must be re-executed.

With this improvement, the compilation-by-level scheme produces a simulation subroutine that:

(i) uses a single node-value array.

(ii) evaluates nodes in a reasonable order: the values of a node's inputs are
calculated before the value of the node itself is calculated.

(iii) deals with feedback by splitting some node in the feedback loop into an input
node (assigned level 0) and a regular node. Both nodes are assigned the same
index, so when the value of the regular node is recomputed it updates the value
of the input node also. Before storing the value of a split node into the
node-value array, it is compared with the current value; if the values are different a
flag is set.

(iv) uses the flag described in step (iii) to indicate when another iteration is needed.
If the flag is set during an execution of the code, another iteration is performed;
otherwise, the subroutine is finished.
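Properties (i) through (iv) can be illustrated with a hand-written Python stand-in for a compiled subroutine. The circuit is an assumption made for this sketch (a cross-coupled NOR latch with inputs S and R, not the circuit of figure 6.8); node K is chosen as the split node, and X values are omitted for brevity.

```python
def nor(a, b):
    return 1 - (a | b)

def run_step(v):
    """Single-array simulation step; returns how many times the code body was executed."""
    executions = 0
    while True:
        executions += 1
        iterate = False
        v["J"] = nor(v["R"], v["K"])     # level 1: reads K through its level-0 (input) copy
        new_k = nor(v["S"], v["J"])      # level 2: the regular copy of the split node K
        if new_k != v["K"]:              # compare before storing, as in (iii)
            iterate = True
        v["K"] = new_k                   # same index, so the input copy is updated too
        if not iterate:                  # (iv): finished unless the flag was set
            return executions

state = {"S": 0, "R": 1, "K": 0, "J": 1}   # R has just gone high
first = run_step(state)
second = run_step(state)                    # inputs unchanged
```

Here `first` is 2, because the split node changes value during the first pass, while `second` is 1: once the split node's old and new values agree, a single execution suffices and no separate comparison of whole arrays is needed.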

The following is an extended example which illustrates the result of a compile-by-level for a single bit
in an nMOS counter. The circuit diagram for the counter bit is shown in the following figure.

[Figure 6.10. Circuit diagram for a one-bit counter]

The target machine for this example is the DEC VAX-11. A node value is a 2-bit quantity (logic low =
0, logic high = 3, X = 1) stored in a byte location; the node-value array is implemented as an array
of bytes. Logical AND† and OR instructions produce the desired answers with this value encoding.
However, using this encoding, the complement instruction does not correctly implement the NOT
operation, so NOT is performed by table lookup. The index of each node is indicated symbolically in
the code below (the index of node A is written "_A").

†The VAX does not, in fact, have an AND instruction. Instead, a "bit clear" (BIC in VAX parlance) is provided,
which implements an AND-COMPLEMENT operation. This introduces a few circumlocutions in the generated code.

; r10   = pointer to value array
; ntbl  = table giving NOT of value
; xtbl  = table giving bit complement of value
; xntbl = table giving bit complement of NOT of value

step:
        clrl    r0                        ; so regs can be used as index registers
        clrl    r1
        movb    #1,iterate_flag           ; nonzero indicates no iteration needed
        movb    _PHI2(r10),r0
        bisb3   ntbl(r0),_OUT(r10),r0     ; r0 = !phi2 + out = !(phi2 · !out)
        movb    _OUT(r10),r1
        bicb3   xtbl(r1),_PHI2(r10),r1
        bisb2   _IN(r10),r1               ; r1 = (phi2 · out) + in
        bicb3   xtbl(r1),r0,_IN(r10)      ; in = r0 · r1
        movb    _IN(r10),r0
        movb    ntbl(r0),_A(r10)          ; a = !in
        movb    _PHI1(r10),r0
        bisb3   ntbl(r0),_A(r10),r0       ; r0 = !phi1 + a = !(phi1 · !a)
        movb    _A(r10),r1
        bicb3   xntbl(r1),_PHI1(r10),r1
        bisb2   _B(r10),r1                ; r1 = (phi1 · !a) + b
        bicb3   xtbl(r1),r0,_B(r10)       ; b = r0 · r1
        movb    _B(r10),r0
        movb    ntbl(r0),_C(r10)          ; c = !b
        movb    _C(r10),r0
        bicb3   xtbl(r0),_CIN(r10),r0
        movb    ntbl(r0),_D(r10)          ; d = !(c · cin)
        movb    _C(r10),r0
        bicb3   xtbl(r0),_D(r10),r0
        movb    ntbl(r0),_E(r10)          ; e = !(c · d)
        movb    _D(r10),r0
        bicb3   xtbl(r0),_CIN(r10),r0
        movb    ntbl(r0),_F(r10)          ; f = !(d · cin)
        movb    _D(r10),r0
        movb    ntbl(r0),_COUT(r10)       ; cout = !d
        movb    _E(r10),r0
        bicb3   xtbl(r0),_F(r10),r0
        cmpb    ntbl(r0),_OUT(r10)        ; check !(e · f) against old value
        beql    1f
        movb    ntbl(r0),_OUT(r10)        ; if different, save new value
        clrb    iterate_flag              ; and set iterate flag so we do it again
        bbcs    #1,iterate_flag,1b        ; check flag, iterate if set
        rsb
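The value encoding used by the listing (logic low = 0, logic high = 3, X = 1) can be checked with a short Python sketch. The table names mirror `ntbl` and `xtbl` from the listing, but the table contents and the helper `bic` are reconstructions made for illustration: `bic` models a bit-clear operation as "source AND NOT mask", and the unused code 2 is given no NOT entry.

```python
LOW, UNKNOWN, HIGH = 0, 1, 3            # 2-bit node-value codes
CODES = (LOW, UNKNOWN, HIGH)

NTBL = [HIGH, UNKNOWN, None, LOW]       # logical NOT by table lookup (2 is not a valid code)
XTBL = [(~v) & 3 for v in range(4)]     # plain 2-bit complement of a code

def bic(mask, src):
    """Bit clear: keep the bits of src that are not set in mask."""
    return src & ~mask

def and_via_bic(a, b):
    """a AND b built from a bit-clear whose mask is the complement of a."""
    return bic(XTBL[a], b)

# Bitwise AND/OR on the codes reproduce the three-valued tables ...
and_ok = (HIGH & UNKNOWN == UNKNOWN) and (LOW & UNKNOWN == LOW)
or_ok = (LOW | UNKNOWN == UNKNOWN) and (HIGH | UNKNOWN == HIGH)
# ... but complementing X yields 2, which is not a valid code, hence the NOT table.
complement_of_x = (~UNKNOWN) & 3
```

This is why the generated code pairs each bit-clear with an `xtbl` lookup: complementing one operand first turns the AND-COMPLEMENT operation into an ordinary AND.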

The code is a relatively straightforward implementation of the equations for each node. Nodes PHI1,
PHI2, and CIN are designated as input nodes. Note that the feedback loop is broken by splitting
node OUT, an arbitrary choice. The resulting simulation is several orders of magnitude more efficient
than a standard switch-level simulation. For example, the value of B is calculated in six instructions,
the value of C in only two. The code is also relatively compact compared to the usual network data
base.

Although compiling by level greatly reduces the amount of wasted computation, there are still
occasions when the values of nodes are unnecessarily calculated. Some input transitions have little
effect on node values; e.g., when PHI1 or PHI2 in the one-bit counter above change from 1 to 0.
This suggests that the performance of the simulator can be improved by generating multiple simulation
routines, where each routine corresponds to a fixed value for one or more inputs. This is particularly
advantageous when the inputs selected for special processing have a major impact on the circuit to be
simulated. For example, in a circuit using two clocks, three separate simulation routines can be
generated; one generated assuming both clocks are low (called, say, clock00), and the other two
generated assuming one of the clocks is high (clock10 and clock01). A four-phase clock cycle is
simulated by executing the simulation subroutines in the correct order:


        jsb     clock10                   ; PHI1 high
        jsb     clock00                   ; both clocks low
        jsb     clock01                   ; PHI2 high
        jsb     clock00                   ; both clocks low

To generate an input-specific simulation routine, the user specifies which nodes are inputs, and for each
input

(1) gives the input's logic value, and

(2) indicates whether the input is stable or has just changed to the specified value.

The compiler applies several optimizations during code generation†: constant folding based on
knowledge of input node values, and compile-time selective trace that ignores nodes whose values
remain unchanged. (The stable/changing specification is used by the selective trace optimization.) The
selective trace is especially effective in reducing the amount of generated code.

†The optimizations are inspired by those found in traditional optimizing compilers [Harrison77, Wulf75]. Because of
the branch-free nature of the code and the pervasive influence of clock signals, many of the optimizations are much
more effective in this domain than in traditional compilation problems.

In the examples below, PHI1 and PHI2 are specified as changing inputs, and CIN as an
unchanging input. The first example, the code generated for the one-bit counter with both clocks
low, illustrates just how effective the optimizations can be:

clock00:                                  ; code for phi1 = 0, phi2 = 0, cin = 1
        clrb    _PHI1(r10)                ; phi1 = 0
        clrb    _PHI2(r10)                ; phi2 = 0
        rsb

The values of PHI1 and PHI2 are set by the code since they are specified as changing inputs. (The
value of an unchanging input is assumed to be set by the user, or by code executed earlier.) Node B is
determined to be unaffected by the change in PHI1, as are nodes IN and PHI2. In fact, the
compile-time selective trace does not find any nodes that change value, except for the changing inputs.

The next code sequence, corresponding to PHI1 going high, is somewhat longer, since that is the
transition when the circuit performs most of its work.

clock10:                                  ; code for phi1 = 1, phi2 = 0, cin = 1
        clrl    r0                        ; so reg can be used as index register
        movb    #3,_PHI1(r10)             ; phi1 = 1
        clrb    _PHI2(r10)                ; phi2 = 0
        movb    _A(r10),_B(r10)           ; b = a
        movb    _B(r10),r0
        movb    ntbl(r0),_C(r10)          ; c = !b
        movb    _C(r10),r0
        movb    ntbl(r0),_D(r10)          ; d = !(c · cin) = !c
        movb    _D(r10),r0
        movb    ntbl(r0),_COUT(r10)       ; cout = !d
        movb    _C(r10),r0
        bicb3   xtbl(r0),_D(r10),r0
        movb    ntbl(r0),_E(r10)          ; e = !(c · d)
        movb    _D(r10),r0
        movb    ntbl(r0),_F(r10)          ; f = !(d · cin) = !d
        movb    _E(r10),r0
        bicb3   xtbl(r0),_F(r10),r0
        movb    ntbl(r0),_OUT(r10)        ; out = !(e · f)
        rsb

A node that connects to the rest of the network through a single pass transistor (e.g., node B in the
counter) is treated specially by the compiler, because such nodes are so common in MOS networks.
When the pass transistor is turned on by a fixed-value input, the generated code is particularly efficient
(a single move in the example above).

The last code sequence, corresponding to PHI2 going high, is relatively short: the compile-time
selective trace finds only a few nodes whose values need to be computed.

clock01:                                  ; code for phi1 = 0, phi2 = 1, cin = 1
        clrl    r0                        ; so reg can be used as index register
        clrb    _PHI1(r10)                ; phi1 = 0
        movb    #3,_PHI2(r10)             ; phi2 = 1
        movb    _OUT(r10),_IN(r10)        ; in = out
        movb    _IN(r10),r0
        movb    ntbl(r0),_A(r10)          ; a = !in
        rsb

Simulation of a four-phase clock cycle using these three routines requires executing only 36 VAX
instructions. The earlier compiled code sequence requires 39 instructions for a single simulation step,
for a total of more than 150 executed instructions when simulating a full clock cycle. Input-specific
subroutines result in a considerable improvement.
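
The structure of the input-specific routines can be mirrored in a minimal Python sketch. It is an illustration under stated assumptions, not the generated code: it uses two-valued logic only (the X state handled by the ntbl and xtbl tables is ignored), it takes the node equations from the comments in the listings above, and it assumes the four-phase cycle visits the clock states in the order 00, 10, 00, 01. The dictionary used for network state and the helper names inv, clock00 and cycle are hypothetical.

```python
def inv(x):
    """Two-valued inverter (assumption: no X state, unlike the ntbl lookup)."""
    return 1 - x

def clock00(s):
    """Both clocks low: only the changing inputs are written."""
    s["phi1"], s["phi2"] = 0, 0

def clock10(s):
    """Specialized for phi1 = 1, phi2 = 0, cin = 1."""
    s["phi1"], s["phi2"] = 1, 0
    s["b"] = s["a"]                    # pass transistor turned on by phi1
    s["c"] = inv(s["b"])
    s["d"] = inv(s["c"])               # !(c * cin) with cin fixed at 1
    s["cout"] = inv(s["d"])
    s["e"] = inv(s["c"] & s["d"])
    s["f"] = inv(s["d"])               # !(d * cin) with cin fixed at 1
    s["out"] = inv(s["e"] & s["f"])

def clock01(s):
    """Specialized for phi1 = 0, phi2 = 1, cin = 1."""
    s["phi1"], s["phi2"] = 0, 1
    s["in"] = s["out"]                 # pass transistor turned on by phi2
    s["a"] = inv(s["in"])

def cycle(s):
    """One four-phase clock cycle (assumed order of clock states)."""
    for routine in (clock00, clock10, clock00, clock01):
        routine(s)

state = {"a": 0}
cycle(state)   # with cin = 1 the stored bit toggles once per cycle
```

Because the fixed inputs are folded into each routine, no routine tests an input value at run time; that is the saving the instruction counts above reflect.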

Although  the  impact  of  compile-time  selective  trace  makes  it  a  worthwhile  optimization,  only  so 
many  input-specific  routines  can  be  generated.  Assuming  that  all  combinations  of  inputs  are  possible, 
the number of routines needed grows exponentially with the number of fixed inputs. Thus, while
computations  caused  by  the  changing  of  a  few  inputs  can  be  reduced  to  the  bare  minimum,  many 
unnecessary  computations  are  still  performed.  For  example,  in  a  10-bit  counter,  the  nodes  comprising 
the  higher  data  bits  are  recomputed  during  each  clock  cycle,  even  though  those  nodes  actually  change 
value  far  less  frequently.  Presumably,  the  appropriate  checks  could  be  inserted  into  the  code,  resulting 
in  branches  around  sections  of  code  that  do  not  need  to  be  executed.  In  the  counter  example,  when 
the carry-in of a data bit is zero, the code for its level and all higher levels does not need to be
executed.  However,  a  very  sophisticated  compiler  would  be  needed  to  handle  this  situation.  It  is 
unclear  what  further  gains  will  be  possible  in  the  search  to  reduce  unnecessary  computation. 
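
The potential saving from branching around inactive stages can be estimated with a small sketch. It models the counter at the bit level rather than the node level (an assumption made to keep the example short), and the names step_counter and count_evaluations are hypothetical. A stage is evaluated only while its carry-in is one; as soon as the carry-in is zero, that stage and all higher stages are skipped.

```python
def step_counter(bits):
    """Increment a ripple counter in place (bits[0] is the low bit).

    Returns the number of stages that actually had to be evaluated.
    """
    carry = 1
    evaluated = 0
    for i in range(len(bits)):
        if carry == 0:
            break          # carry-in is zero: this and all higher stages hold
        evaluated += 1
        bits[i], carry = bits[i] ^ carry, bits[i] & carry
    return evaluated

def count_evaluations(width, cycles):
    """Run the counter for a number of cycles and total the stage evaluations."""
    bits = [0] * width
    total = sum(step_counter(bits) for _ in range(cycles))
    return bits, total

final_bits, total = count_evaluations(10, 1024)
# Evaluating every stage on every cycle would cost 10 * 1024 stage evaluations.
```

In this simplified model the 10-bit counter needs roughly two stage evaluations per cycle on average instead of ten, which suggests the scale of the unnecessary computation described above.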

In  summary,  the  compilation  techniques  discussed  in  this  chapter  are  well-suited  for  producing 
code  that  implements  a  fast  switch-level  simulation  of  a  stable  design.  The  potential  increase  in 
simulation  speed  allows  more  exhaustive  checkout  than  is  possible  with  interactive  (and  slower) 
simulators.  Compilation-based  simulation  is  most  appropriate  for  a  circuit  with  a  high  degree  of  circuit 
activity;  if  each  circuit  component  is  active  during  each  simulation  step,  there  is  very  little  unnecessary 
computation  by  the  simulation  subroutine.  On  the  other  hand,  for  a  large  circuit  with  little  activity,  an 
event-driven  interactive  simulator  might  actually  outperform  a  compiled  simulation.  Fortunately,  not 
many designers strive for designs in this latter category.

CHAPTER SEVEN


CONCLUSIONS 


The  models  and  simulators  presented  in  this  thesis  were  developed  to  fill  the  need  for  simulation 
tools suitable for large MOS designs. At the outset of the project, there were surprisingly few
alternatives;  even  today,  much  of  the  work  in  the  area  of  simulation  tools  concentrates  on  refurbishing 
traditional  gate-level  simulators  and  circuit  analysis  programs.  (The  current  state  of  these  efforts  is 
outlined  at  the  end  of  the  chapter.)  The  work  reported  here  takes  a  different  approach,  seeking  to 
develop  new  algorithms,  guided  by  the  following  goals: 

(1) The algorithms must be suitable for the logic-level simulation of large digital MOS
circuits; "large" meaning circuits containing 10,000 to 50,000 transistors.

(2)  Important  aspects  of  MOS  behavior  (bidirectionality,  charge  sharing/storage, 
pullup/pulldown  ratios,  etc.)  should  be  modeled  in  a  useful  way. 

(3)  Performance  estimates  should  be  calculated  directly  from  the  actual  parameters 
of the circuit components. Ideally, the calculations are based on the same rules
of  thumb  used  by  designers  when  estimating  circuit  performance. 

The RSIM simulator meets all three goals, while maintaining a reasonable balance between simulator
performance and accuracy of predictions. Rather than performing a detailed simulation of each
transistor's operation, RSIM uses the linear model to directly predict the logic state of each node and to
estimate transition times when nodes change state. The net effect is a trade of some prediction
accuracy for an increase in simulation speed. When the linear model is conservatively calibrated, its
predictions can be used to identify problem circuits in need of more accurate analysis. Usually, a large
percentage of a circuit passes the scrutiny of RSIM, and so the expense associated with detailed
simulation of the whole circuit is avoided. In addition to serving as the basis for simulation, the linear
model can be used in timing analysis and might serve to quickly generate initial waveforms for a
relaxation-based circuit analysis program.

RSIM has been in use in both university and industrial environments since the spring of 1982.
During that time it has simulated several hundred designs, ranging in size from very small to
approximately 40,000 transistors. Because RSIM is fast enough to simulate a whole circuit, it often
uncovers circuit flaws that have fallen between the cracks during the simulation of smaller pieces of
the design. The trend is for designers to view RSIM as a companion to circuit analysis, using it for all
logic-level verification and preliminary timing analysis, and resorting to circuit analysis for those paths
identified as critical by RSIM.

The simulation algorithm is embedded in a LISP-like command language [Terman82] that has
been used to write quite elaborate programs to drive the simulation and process the results. Since
programs to prepare simulation input are much less tedious to construct than the input itself, designers
have been able to conduct more tests than they might otherwise do. For example, it is a simple matter
to use a set of test vectors that drive a register-transfer-level simulation as input to an RSIM run, and
compare the predictions of the two simulations, all under program control.

With careful calibration, RSIM's predictions for combinational logic are within 30% of those of
SPICE. For circuits relying on analog behavior (sense amplifiers, bootstrapped nodes, etc.) or chains of
pass devices, the predictions are less accurate. To compensate, several "escape" mechanisms exist
which allow the designer to specify the logic thresholds and transition times for individual nodes so
that the results of more detailed simulations can be incorporated into RSIM. Usually this mechanism
need be invoked for only a few critical nodes (e.g., clock driver outputs). Another alternative is to
identify subcircuits and replace them with logically equivalent circuits that can be simulated easily; a
network preprocessor [Ilcr83] that performs subcircuit matching and replacement is available and has
been used to good effect. With these enhancements, RSIM has proved to be a fairly reliable filter for
detecting circuits in need of more careful analysis.

For  those  stages  of  the  design  process  that  do  not  require  performance  information,  a  switch 
model  might  be  more  appropriate  than  a  linear  model.  A  switch-level  simulation  is  particularly  useful 
in  the  early  stages  of  a  design  when  one  is  experimenting  with  the  organization  of  the  logic,  and  sizing 
each device would be distracting. The switch models presented in this thesis are straightforward,
especially in the treatment of X values and their effect on the network. The switch model as
embodied in ESIM (which uses the global algorithm outlined in Chapter 5) is quite compatible with the
linear model used in RSIM. In fact, in the current implementation both models exist side by side and
one can choose either model when propagating a set of changes through the network. This flexibility
is useful during initialization of a network, when performance information is not a major concern.

Simulator performance is always an important issue, one that has been addressed throughout the
thesis. Chapter 4 describes several techniques for speeding up the RSIM algorithm; using a compressed
representation of logic gates and caching subnetwork calculations decreases the execution time of RSIM
by a factor of two or more. The local switch algorithm presented in Chapter 5 is ideal for
implementation on parallel architectures. Like many relaxation algorithms, it can effectively utilize
many processors, and so holds the promise of large performance improvements in simulation when
parallel processors move out of the experimental stage. A different approach for improving the
performance of switch-level simulation is described in Chapter 6, which proposes performing the
network analysis once, before simulation, and using the results to compile a set of logic equations for
each node. When evaluated in the proper order by a conventional computer, the resulting switch-level
simulation is many times faster than simulation using traditional techniques. The node equations can
also be used to develop instruction sequences for special-purpose simulation hardware (e.g., the
Yorktown Simulation Engine or the Zycad multi-processor), extending the benefits of high-speed
gate evaluation to arbitrary MOS networks [Barzilai83].

The  remainder  of  this  chapter  discusses  other  work  in  the  area  of  simulation  related  to  the  topics 
of  concern  in  this  thesis.  These  topics  include: 

•  algorithms  for  fast  circuit  analysis;  circuit  analysis  using  simplified  models 

•  mixed-mode  simulation 

• logic-level simulation using pre-determined transition delays

•  models  for  estimating  circuit  performance 

•  other  switch-level  simulation  algorithms 

Each of these areas is discussed below.

The most detailed and accurate network simulation is provided by circuit analysis programs, such
as ASTAP [Weeks73] or SPICE [Nagel75]. The capacity limitation of circuit analysis is a prime motivation
for the development of simpler simulation models; recent improvements in circuit analysis algorithms
are making inroads into the traditional performance problems of circuit analysis. Device models are
the heart of a circuit analysis program. The models are usually analytic: they contain formulas that
predict device performance from information about voltage histories, physical properties of materials,
etc. In a circuit, the behavior of a particular device might be determined by several electrical nodes
which, in turn, are affected by other devices; i.e., a system of circuit equations is needed to describe
the behavior of the circuit as a whole. To make finding a solution computationally feasible, most
circuit analysis programs proceed in two steps:

(1) The circuit is partitioned so that, at a particular time step, the change in voltage
on each node is approximated as a linear function of the node voltages (and their
derivatives). It is during this step that device models must be evaluated.

(2) The resulting set of equations is solved numerically (see [SV80]).

These  two  steps  can  be  quite  time  consuming,  although  for  large  circuits  the  second  step  becomes  the 
dominant factor [Newton80]. RSIM reduces both costs by using a very simple device model whose
effects  can  be  predicted  without  the  need  for  expensive  numerical  techniques. 

The  cost  of  model  evaluation  can  be  reduced  by  replacing  the  analytic  device  models  with  tables 
relating device current to terminal voltages [Chawla75, Fan77]. These tables can be derived from a
one-time  evaluation  of  the  original  analytic  models,  or  filled  directly  from  device  measurements.  In 
these  simulators,  the  current  charging/discharging  of  each  node  capacitor  is  determined  from  the 
present  node  voltages;  thus,  the  change  in  node  voltage  for  each  time  step  can  be  calculated  directly 
and  the  cost  of  solving  a  set  of  simultaneous  equations  is  avoided.  Another  approach  to  reducing  the 
cost  associated  with  dealing  with  large  matrices  of  equation  coefficients  uses  a  relaxation  technique 
[Lelarasmee81,  Newton83]  to  successively  approximate  the  voltage  waveform  for  each  node  in  the 
circuit.  The  solution  for  each  node  is  computed  separately,  using  the  estimates  of  other  node  voltages 
computed  during  earlier  iterations.  Again,  this  avoids  the  cost  of  solving  a  large  set  of  simultaneous 
equations.  It  is  also  possible  to  skip  the  recalculation  of  a  node’s  waveform  during  a  particular 
iteration  if  it  can  be  determined  that  the  estimates  for  the  surrounding  network  have  not  changed 
substantially since the last iteration (i.e., selective-trace comes to circuit analysis). These techniques can
speed  up  circuit  analysis  by  an  order  of  magnitude  or  more,  but  the  programs  are  still  limited  to 
circuits  with  a  few  thousand  components. 
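
The table-driven approach can be sketched in a few lines. This is a minimal illustration, not code from any of the cited simulators: the function names are hypothetical, the "device" is a plain linear resistor (chosen only so that the result can be compared with the familiar exponential RC decay, whereas a real table would hold nonlinear MOS currents), and the component values are made up. The table is filled once from an analytic model; at each time step the node voltage is then updated directly from the tabulated current, with no simultaneous equations to solve.

```python
import bisect

def make_table(current_fn, v_max, points):
    """One-time evaluation of an analytic device model into (voltage, current) samples."""
    volts = [v_max * i / (points - 1) for i in range(points)]
    return volts, [current_fn(v) for v in volts]

def lookup(table, v):
    """Device current at voltage v by linear interpolation between table entries."""
    volts, amps = table
    i = min(max(bisect.bisect_right(volts, v) - 1, 0), len(volts) - 2)
    frac = (v - volts[i]) / (volts[i + 1] - volts[i])
    return amps[i] + frac * (amps[i + 1] - amps[i])

def discharge(table, v0, cap, dt, steps):
    """Explicit time stepping: dV = -I(V) * dt / C at every step."""
    v = v0
    for _ in range(steps):
        v -= lookup(table, v) * dt / cap
    return v

R = 10e3      # hypothetical 10 kilohm pulldown
C = 1e-12     # hypothetical 1 pF node capacitance
table = make_table(lambda v: v / R, 5.0, 51)
v_after_tau = discharge(table, 5.0, C, R * C / 1000, 1000)
# v_after_tau should lie close to 5 * exp(-1), about 1.84 volts
```

The explicit update is cheap, but its accuracy depends on the step size, which is one reason such simulators trade some precision for speed.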

Recent work on simulators has tried to combine the computational advantages of gate-level
simulation with the precision afforded by circuit analysis; this has led to a new family of mixed-mode
simulators [Chen78, Gardner79, Hill79, Agrawal80, Newton80]. The designer can specify gate-level or
functional simulation for simple or previously-verified pieces of the circuit, reserving the expense of
circuit analysis for critical sections of the design. There are two problems that remain to be solved in
mixed-mode simulators: conversion between the different representations of node values used by the
different levels, and the related problem of choosing which type of simulation should be used for each
subcircuit. The designer can introduce errors into the simulation by an unfortunate choice of level at a
critical point in the circuit; special care must be exercised to avoid discontinuities and other pitfalls of
the numerical solution techniques. Like circuit analysis, mixed-mode analysis still requires the touch of
an expert lest it produce misleading results.

Clearly,  it  is  only  a  matter  of  time  before  mixed-mode  simulation  becomes  true  hierarchical 
simulation  in  which  the  results  of  detailed  low-level  simulation  are  automatically  summarized  for  use  in 
subsequent  high-level  simulations.  A  hierarchical  system  would  also  decide  what  level  of  simulation  is 
appropriate for each subcircuit. Viewed in this light, RSIM can be thought of as the first step toward
automatic identification of critical subcircuits. With a foot in both worlds, RSIM provides an easy path
for  descending  into  circuit  analysis  or  for  abstracting  toward  higher-level  logic  functions. 

Another approach to timing simulation that retains the speed advantages of gate-level simulation
is determining the transition delays for each node before simulation begins. Some gate-level
simulators [Szygenda72, Case78] allow the user to assign node delays. This type of simulator can be
extended to handle MOS networks, after a fashion [Sherwood81, McDermott82]. The result is a system
that can quickly calculate estimates for signal delays in a network. Unfortunately, the delays are not
calculated automatically (and hence are prone to error or wishful thinking on the part of the designer),
and are approximate at best for pass transistor circuits so common in MOS circuits. A more effective
technique for pre-computing delays is the use of the results of actual measurements or circuit analysis
runs [Pilling73, Nahm80]. The delays are measured/calculated for "standard" gate configurations, and
the results used to estimate the performance for the actual configuration of each node in the network.
[Nahm80] mentions several shortcomings of this approach. Circuits with multiple inputs are difficult to
analyze since a particular input transition is chosen when performing the analysis; also, the effect of
overlapping input transitions, the slope of the input waveform, and dynamic changes in the output
load are not considered. (Interestingly, all these problems are solved in a straightforward way by RSIM,
at no great loss in execution speed, as evidenced by the performance figures quoted by Nahm.)
[Okasaki83] suggests overcoming these problems by expanding the set of "standard" configurations to
include most of those commonly found in MOS circuits (complex and/or gates, pass gates, etc.). The
price for the increase in accuracy is a corresponding increase in the complexity of the model for each
gate; his simulator spends a fair amount of time determining which pre-computed delay should be
used, given the current configuration of the network. In summary, the performance variations
introduced by non-standard circuit configurations, and changes in the network due to changing node
values, seem to offset any advantages offered by pre-determined transition delays.

Not much has been published about models that are suitable for quickly determining the
transition times for particular network configurations. A switched linear Thevenin model is described
in [Glasser80]; a simulator based in part on this model is described in [Tamura83]. Multiple resistances
are used to describe each transistor; conceptually, the appropriate resistance is selected by a rotary
switch controlled by the transistor's gate voltage. Each resistance is chosen to model the actual
channel resistance in a particular region of device operation. The linear model presented in this thesis
can be viewed as a simplification of Glasser's model, with only two possible switch positions selecting
between resistances of $R_{eff}$ and $\infty$. A simple version of the linear model also appears in
[Ousterhout83] and [Jouppi83]; both indicate that the model improvements suggested in Chapter 3 are
needed in order to improve prediction accuracy. [Horowitz83] presents a simple model that describes
the performance of a network of pass gates; his model is discussed in section 3.5.

One simulator with many of the same aspirations as the switch-level simulators described in
Chapter 5 is MOSSIM, written by Randy Bryant [Bryant81]. MOSSIM uses a switch transistor model
similar to that presented here, but its calculations are organized differently since (1) node values are
represented using a cross-product value set and (2) the analysis is based on a static decomposition of
the network. A major difference in the simulation calculation comes in the handling of X values and
their effect on the surrounding network. Bryant handles such values in a separate stage of the
computation, using global knowledge of the network configuration to resolve values of subnetworks
connected by X transistors. (Other differences between the two approaches are discussed in Chapters
2 and 5.) The extra complexity of his algorithms results in some degradation in simulator performance
over that achieved by the simulators described here.

APPENDIX ONE


Proof of Lemma 5.3


Lemma 5.3. Let $W$, $X$, and $Y$ be network states. If $W \to X$ and $W \to Y$, then
there exists a network state $Z$ such that $X \to Z$ and $Y \to Z$.

Recalling how the update operation works, it is not hard to believe that the Lemma is true. The value
of a node indicates the resistance of paths from the node to VDD and GND. An update exchanges path
information across a switch, and the $\cup$ operation ensures that information is never lost (the indicated
resistance to an input can never increase). Intuitively, an update only adds information about possible
paths to the network state, so no matter what switch is chosen for an update, one can also go back to
other switches later on.

The proof is straightforward, demonstrating how a state $Z$ can be constructed for each possible $X$
and $Y$. The proof depends on some simple properties of the $\cup$ operation and the switch function:

\[
\begin{aligned}
A \cup A &= A \\
A \cup B &= B \cup A \\
\alpha \cup \mathit{switch}(\sigma, \alpha) &= \alpha \\
\mathit{switch}(\sigma, \mathit{switch}(\sigma, \alpha)) &= \mathit{switch}(\sigma, \alpha) \\
\mathit{switch}(\sigma, \alpha \cup \beta) &= \mathit{switch}(\sigma, \alpha) \cup \mathit{switch}(\sigma, \beta)
\end{aligned}
\tag{A1.1}
\]

which can be verified directly from the definition of $\cup$ and equation 5.9.

If the two updates leading to states $X$ and $Y$ involve only one switch, $X = Y$ and the Lemma is
trivially true. If two separate switches are involved, there are three cases to consider which differ in
the number of nodes affected.


(a) Case 1    (b) Case 2    (c) Case 3

Figure A1.1. Three cases in proof of Lemma 5.3


For notational convenience, define the functions $f$ and $g$ to describe the effects of switch 1 and 2
respectively:

\[
\begin{aligned}
f(\alpha) &= \mathit{switch}(\sigma_1, \alpha) \\
g(\alpha) &= \mathit{switch}(\sigma_2, \alpha)
\end{aligned}
\tag{A1.2}
\]

Each of the two updates is labeled by the switch it operates on: for example, $S_1$ refers to an update
involving switch 1. A sequence of updates is written as $S_i S_j$, which is taken to mean update $S_i$
followed by update $S_j$.

Case 1: no nodes in common. As the following diagram indicates, when the updates have no nodes in
common, they result in the same state when applied in either order.


Figure A1.2. State diagram when no nodes in common

This is shown by considering the values for nodes A, B, C, and D after each update:

\[
\begin{array}{l|llll}
\text{sequence} & A & B & C & D \\
\hline
S_1     & A \cup f(B) & B \cup f(A) & C           & D \\
S_1 S_2 & A \cup f(B) & B \cup f(A) & C \cup g(D) & D \cup g(C) \\
S_2     & A           & B           & C \cup g(D) & D \cup g(C) \\
S_2 S_1 & A \cup f(B) & B \cup f(A) & C \cup g(D) & D \cup g(C)
\end{array}
\]

The final states of the two sequences are the same, demonstrating the desired network state, $Z$.

Case 2: one node in common. As the following diagram indicates, when the updates have one node in
common, $S_1 S_2 S_1$ is equivalent to $S_2 S_1 S_2$.


Figure A1.3. State diagram when one node in common

This is shown by considering the values for nodes A, B, and C after each update:

\[
\begin{array}{l|lll}
\text{sequence} & A & B & C \\
\hline
S_1         & A \cup f(B) & B \cup f(A) & C \\
S_1 S_2     & A \cup f(B) & B \cup f(A) \cup g(C) & C \cup g(B \cup f(A)) \\
S_1 S_2 S_1 & A \cup f(B) \cup f(B \cup f(A) \cup g(C)) & B \cup f(A) \cup g(C) \cup f(A \cup f(B)) & C \cup g(B \cup f(A)) \\
S_2         & A & B \cup g(C) & C \cup g(B) \\
S_2 S_1     & A \cup f(B \cup g(C)) & B \cup g(C) \cup f(A) & C \cup g(B) \\
S_2 S_1 S_2 & A \cup f(B \cup g(C)) & B \cup g(C) \cup f(A) \cup g(C \cup g(B)) & C \cup g(B) \cup g(B \cup g(C) \cup f(A))
\end{array}
\]

Using the identities in equation A1.1, the final values of A, B, and C for each sequence can be
simplified to

\[
\begin{aligned}
A_{final} &= A \cup f(B) \cup f(g(C)) \\
B_{final} &= B \cup g(C) \cup f(A) \\
C_{final} &= C \cup g(B) \cup g(f(A))
\end{aligned}
\tag{A1.3}
\]

The final states of the two sequences are the same, demonstrating the desired network state, $Z$.


Case 3: two nodes in common. As in Case 1, when the updates have two nodes in common, they result
in the same state when applied in either order. This is shown by considering the values for nodes A
and B after each update:

\[
\begin{array}{l|ll}
\text{sequence} & A & B \\
\hline
S_1     & A \cup f(B) & B \cup f(A) \\
S_1 S_2 & A \cup f(B) \cup g(B \cup f(A)) & B \cup f(A) \cup g(A \cup f(B)) \\
S_2     & A \cup g(B) & B \cup g(A) \\
S_2 S_1 & A \cup g(B) \cup f(B \cup g(A)) & B \cup g(A) \cup f(A \cup g(B))
\end{array}
\]

Again, using the identities in equation A1.1, the final values of A and B for each sequence can be
simplified to

\[
\begin{aligned}
A_{final} &= A \cup f(B) \cup g(B) \\
B_{final} &= B \cup g(A) \cup f(A)
\end{aligned}
\]

The final states of the two sequences are the same, demonstrating the desired network state, $Z$.

APPENDIX TWO


RSIM Calibration Tables for a 5μ nMOS Process


RSIM's transistor model relies in part on three modeling resistances for each transistor in the
network:

    $R_{static}$, the static resistance described below,

    $R_{dynlow}$ for calculating the transition time for high-to-low transitions, and

    $R_{dynhigh}$ for calculating the transition time for low-to-high transitions.

These resistances are chosen for each transistor on the basis of its geometry, type, and usage in the
circuit. The static resistance is chosen to obtain a good prediction for the 0-output voltage of a logic
gate. Actually this constrains only the ratio of the n-channel and pullup static resistances, so there is
considerable freedom in choosing these values.

The dynamic resistances for each transistor type are specified in the following table. Because
of their special nature, depletion devices configured as pullups are treated separately from other
depletion devices.

    transistor type    $R_{dynlow}$           $R_{dynhigh}$
    n-channel          Tables A2.1 & A2.2     Table A2.3
    depletion          (see text)             Table A2.4
    pullup             -                      Table A2.5

The tables appear at the end of this appendix. $R_{dynlow}$ is not needed for a pullup, but might be
needed for other configurations of depletion devices (e.g., if one appeared in a pulldown path). If
desired, a very high $R_{dynlow}$ can be specified for depletion devices to flag the use of a depletion device
in a pulldown path.

The tables below were prepared by analyzing the simple SPICE experiments proposed in section
2.4. As mentioned in that section, more sophisticated experiments might be more appropriate for
designers who wish to push RSIM to its limits. These tables are used by examples in the thesis; for
actual simulation, some of the values should be derated (increasing the resistance) to ensure
conservative estimates.

The experiments were run using version 2G.5 of SPICE with the following device models (a
typical 5μ nMOS process):

    .MODEL ENH NMOS (LEVEL=2 VTO=1.0 PHI=0.55 GAMMA=0.4 CGSO=4.5E-10 PB=0.85
    + JS=1E-18 CJ=7.2E-5 CJSW=3.6E-10 TOX=1E-7 NSUB=1.0E15 XJ=1E-6 LD=0.7E-6
    + UO=690 UCRIT=1E5 UEXP=0.12 MJ=0.5 MJSW=0.27)

    .MODEL DEP NMOS (LEVEL=2 VTO=-3.3 PHI=0.55 GAMMA=0.47 CGSO=4.5E-10 PB=0.85
    + JS=1E-18 CJ=7.2E-5 CJSW=3.6E-10 TOX=1E-7 NSUB=1.0E15 XJ=1E-6 LD=0.7E-6
    + UO=690 UCRIT=1E5 UEXP=0.12 MJ=0.5 MJSW=0.27)

Rise time is measured as the length of time needed for an output to rise from 0 volts to 2.1.14 volts,
the balance point of a 4:1 inverter built using this process. (Section .0.1 explains why the balance
point is chosen as the threshold.) Fall time is the length of time needed for an output to fall from 5
volts to the threshold.

Note that widths and lengths are shown in microns, and the table values are in units of KΩ per
square of channel: one must multiply the appropriate table entry by the number of squares of channel
(length ÷ width) to get a transistor's resistance. For table entries marked "-", no value is available
because of a SPICE bug.

                                    Length
    Width        5      10      20      30      40      50     100
        5      8.7    11.6    16.2    17.1    17.5    17.8    18.4
       10      8.8    11.7    16.2    17.1    17.6    17.8    18.5
       20      8.8    11.8    16.1    17.1    17.8    18.0    18.9
       30      9.0    11.8    16.5    17.4    17.9    18.2    [illegible]
       40      9.6    14.0    16.6    17.6    18.1    18.5    19.6
       50     10.0    14.0    16.8    17.7    18.3    18.7    20.6
      100     10.0    15.0    17.0    18.7    19.1    19.8    21.9

Table A2.1. Channel resistance (KΩ/□) for n-channel pulldowns
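
As a worked illustration of the rule stated before the tables (table entry times the number of squares of channel), the following sketch computes a transistor's resistance and the corresponding RC product. The function names, the 12.0 KΩ/□ entry, and the 0.1 pF load are hypothetical values chosen for the arithmetic, not calibration data; only the 8.7 KΩ/□ figure is taken from Table A2.1 (width 5, length 5).

```python
def channel_resistance(kohm_per_square, length_um, width_um):
    """Transistor resistance in kilohms: table entry times squares of channel."""
    squares = length_um / width_um
    return kohm_per_square * squares

def time_constant_ns(resistance_kohm, capacitance_pf):
    """RC product; kilohms times picofarads gives nanoseconds."""
    return resistance_kohm * capacitance_pf

# A 5 x 5 micron pulldown is one square, so its resistance equals the table entry.
r_square = channel_resistance(8.7, 5, 5)

# Hypothetical entry of 12.0 KΩ/□ for a device 20 microns long and 5 microns wide:
# four squares of channel, hence four times the per-square resistance.
r_long = channel_resistance(12.0, 20, 5)
tau = time_constant_ns(r_long, 0.1)   # with a hypothetical 0.1 pF load
```

Note that the table entry itself depends on both length and width, so the entry must be looked up for the actual geometry before it is scaled by the number of squares.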


    $R_{enh-thresh}$                    Length
    Width        5      10      20      30      40      50     100
        5     16.0    26.1    31.5    33.3    34.1    34.6    35.6
       10     16.6    26.9    32.1    33.7    34.6    35.0    35.9
       20     17.6    28.0    32.9    34.4    35.1    35.5    35.5
       30     18.6    28.8    33.5    34.8    35.4    35.8    36.4
       40     19.2    29.6    33.8    35.1    35.7    36.0    36.6
       50     20.0    30.0    34.3    35.3    35.9    36.2    36.8
      100     22.0    31.0    35.5    36.3    36.8    37.0    37.6

Table A2.2. Channel resistance (KΩ/□) for n-channel pulldowns with threshold drops


    $R_{enh-sf}$                        Length
    Width        5      10      20      30      40      50     100
        5     12.6    22.8    28.8    31.2    32.5    33.5    36.7
       10     12.8    23.1    29.5    32.2    34.0    35.4    40.5
       20     12.8    23.6    30.8    34.3    36.9    39.0    48.1
       30     13.2    24.3    32.1    36.5    39.8    42.7    55.7
       40     13.6    24.8    33.6    38.5    [illegible]    46.4    63.3
       50     14.0    25.5    35.0    40.7    45.6    50.1    70.9
      100     14.0    28.0    41.5    51.3    60.3    68.6       -

Table A2.3. Channel resistance (KΩ/□) for n-channel source-followers


Table A2.5. Channel resistance (KΩ/□) for depletion pullups

APPENDIX THREE


Approximation  for  Resistor  Divider  and  Series  Resistor 


As part of the incremental computation for the Thevenin equivalent of a network, it is necessary
to approximate a resistor divider and series resistance (figure A3.1(a)) by a simple resistor divider
(figure A3.1(b)).


Figure A3.1. Initial resistor network and desired approximation


As usual, each resistance is potentially a resistance interval. An exact choice for the modeling
resistance is impossible (as will be shown below), so the goal of this appendix is to choose a
suitable approximation.

Consider a resistor divider with pullup resistance P and pulldown resistance Q.

(a) resistor divider    (b) Thevenin equivalent

Figure A3.2. Resistor divider and Thevenin equivalent

The parameters of the Thevenin equivalent are

    $V_{thev} = \frac{Q}{P + Q}$  and  $R_{thev} = P \parallel Q$        (A3.1)

which can be rearranged as linear equations relating $R_{thev}$ and $V_{thev}$:

    $R_{thev} = P \, V_{thev}$  and  $R_{thev} = Q \, (1 - V_{thev})$        (A3.2)
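
The rearrangement is short enough to check by hand. The following sketch writes $P \parallel Q$ as $PQ/(P+Q)$ and assumes, as equation A3.1 implies, that $V_{thev}$ is normalized to the supply voltage:

```latex
\begin{align*}
R_{thev} &= \frac{PQ}{P+Q} = P \cdot \frac{Q}{P+Q} = P \, V_{thev}, \\
1 - V_{thev} &= 1 - \frac{Q}{P+Q} = \frac{P}{P+Q}
  \quad\Longrightarrow\quad
  Q \, (1 - V_{thev}) = \frac{PQ}{P+Q} = R_{thev}.
\end{align*}
```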

If P and Q are intervals, $P = [P_l, P_h]$ and $Q = [Q_l, Q_h]$, then the Thevenin parameters also are
intervals:

    $V_{thev} = \left[ \frac{Q_l}{Q_l + P_h}, \frac{Q_h}{Q_h + P_l} \right]$  and  $R_{thev} = [\, P_l \parallel Q_l, \; P_h \parallel Q_h \,]$        (A3.3)

If one plots the Thevenin parameter values ($R_{thev}$ vs. $V_{thev}$) as P and Q are varied independently
over their respective intervals, equation A3.3 suggests the resulting area would be rectangular, but this
is not the case, as is illustrated by the following figures.


(a) P, Q constant    (b) P constant, Q varying    (c) P varying, Q constant

Figure A3.3. Thevenin plots as P and Q are varied one at a time

Equation A3.2 tells us that if, say, Q is held constant and P is varied, the plot is a straight line of slope
$-Q$ which, if extended, would intersect the $V_{thev}$ axis at $V_{thev} = 1$ (see figure A3.3(c)). When both P
and Q are varied (see figure A3.4), the plot produces a diamond-shaped quadrilateral, and not a
rectangle.


Figure A3.4. Thevenin plot as P and Q are varied simultaneously


Although the limits of $R_{thev}$ and $V_{thev}$ are the ones shown in equation A3.3, certain combinations of
Thevenin parameters permitted by the equation are clearly ruled out by the diagram above.
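
The point can be checked numerically. The sketch below is illustrative only: the function names and interval values are assumptions and are not taken from RSIM. It computes the bounds of equation A3.3, then uses equation A3.2 to test whether each corner of the bounding rectangle corresponds to any P and Q inside their intervals.

```python
def thevenin(p, q):
    """Normalized Thevenin voltage and resistance of a divider (equation A3.1)."""
    return q / (p + q), p * q / (p + q)


def interval_bounds(p_lo, p_hi, q_lo, q_hi):
    """Extreme values of V_thev and R_thev over the intervals (equation A3.3)."""
    v_bounds = (q_lo / (q_lo + p_hi), q_hi / (q_hi + p_lo))
    r_bounds = (thevenin(p_lo, q_lo)[1], thevenin(p_hi, q_hi)[1])
    return v_bounds, r_bounds


def reachable(v, r, p_lo, p_hi, q_lo, q_hi, tol=1e-9):
    """True if some P, Q inside the intervals produce the point (v, r).

    Equation A3.2 gives the only candidates: P = r / v and Q = r / (1 - v).
    """
    p, q = r / v, r / (1.0 - v)
    return (p_lo - tol <= p <= p_hi + tol) and (q_lo - tol <= q <= q_hi + tol)


if __name__ == "__main__":
    # Hypothetical intervals: P_l, P_h, Q_l, Q_h.
    limits = (10.0, 20.0, 10.0, 20.0)
    (v_lo, v_hi), (r_lo, r_hi) = interval_bounds(*limits)
    for v in (v_lo, v_hi):
        for r in (r_lo, r_hi):
            print(f"corner V={v:.3f} R={r:.3f} reachable: "
                  f"{reachable(v, r, *limits)}")
```

For these example intervals none of the four rectangle corners is reachable, whereas the points produced by the interval endpoints, such as `thevenin(10.0, 20.0)`, are the vertices of the diamond.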

If a series resistance R is now added, the resulting Thevenin plot is shown in the following
figure.


Figure A3.5. Thevenin plot when series resistance is added


The  result  is  not  a  plot  of  a  resistor  divider  at  all.  In  order  to  approximate  the  circuit  by  a  divider,  a 
decision  is  needed  concerning  which  information  to  preserve  with  the  approximation. 

Since the approximation under development is used to calculate $V_{thev}$, it is important to preserve
information about the maximum and minimum of the circuit's voltage. This constraint fixes the right
and left vertices of the diamond. The top and bottom vertices are constrained by the choice of
resistance information to preserve; since it is better to overestimate than to underestimate resistances,
the minimum value of $R_{thev}$ is preserved. The resulting divider is shown graphically in the following
figure. The voltage constraints are shown as dashed vertical lines; the resistance constraint as the
circled vertex.


Figure A3.6. Thevenin plot showing approximating divider

The values for $A_l$ and $B_l$ are determined by the second constraint and equation A3.2:

    $R_l + (Q_l \parallel P_l) = A_l \frac{Q_l}{P_l + Q_l}$  and  $R_l + (Q_l \parallel P_l) = B_l \left( 1 - \frac{Q_l}{P_l + Q_l} \right)$        (A3.4)

This fixes the two lines that form the bottom half of the diamond. Next, the values of $A_h$ and $B_h$ are
chosen so that the left and right vertices of the diamond have the same $V_{thev}$ coordinates as in figure
A3.5:

    $\frac{B_l}{A_h + B_l} = \frac{Q_l}{P_h + Q_l}$  and  $\frac{B_h}{A_l + B_h} = \frac{Q_h}{P_l + Q_h}$        (A3.5)

Solving equations A3.4 and A3.5 for the parameters of the approximating divider yields:

    $A_l = P_l + R_l + R_l \frac{P_l}{Q_l}$        $A_h = P_h + R_l \frac{P_h}{P_l} + R_l \frac{P_h}{Q_l}$

    $B_l = Q_l + R_l + R_l \frac{Q_l}{P_l}$        $B_h = Q_h + R_l \frac{Q_h}{Q_l} + R_l \frac{Q_h}{P_l}$        (A3.6)

Note that all resistances are greater than the minimum resistance of the series resistor ($R_l$). A
different choice of what resistance information to preserve (as was made in early versions of RSIM)
might cause $A_l$ and $B_l$ to be less than $R_l$, leading to pessimistic voltage predictions for some nMOS
circuits.
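
As a minimal sketch of equation A3.6, the function below computes the approximating divider from the interval endpoints and the minimum series resistance. The function name, argument names and numeric values are hypothetical, chosen only for illustration, and this is not RSIM's actual code. The main block checks the three constraints the derivation is built on.

```python
import math


def approximating_divider(p_lo, p_hi, q_lo, q_hi, r_lo):
    """Return ((A_l, A_h), (B_l, B_h)) as given by equation A3.6.

    p_*, q_* are the endpoints of the pullup and pulldown intervals;
    r_lo is the minimum value of the series resistance.
    """
    a_lo = p_lo + r_lo + r_lo * p_lo / q_lo
    b_lo = q_lo + r_lo + r_lo * q_lo / p_lo
    a_hi = p_hi + r_lo * p_hi / p_lo + r_lo * p_hi / q_lo
    b_hi = q_hi + r_lo * q_hi / q_lo + r_lo * q_hi / p_lo
    return (a_lo, a_hi), (b_lo, b_hi)


def parallel(x, y):
    """Resistance of x in parallel with y."""
    return x * y / (x + y)


if __name__ == "__main__":
    # Hypothetical values: P = [10, 30], Q = [20, 50], minimum series R = 4.
    p_lo, p_hi, q_lo, q_hi, r_lo = 10.0, 30.0, 20.0, 50.0, 4.0
    (a_lo, a_hi), (b_lo, b_hi) = approximating_divider(p_lo, p_hi, q_lo, q_hi, r_lo)

    # Minimum Thevenin resistance is preserved (equation A3.4).
    assert math.isclose(parallel(a_lo, b_lo), r_lo + parallel(p_lo, q_lo))
    # Minimum and maximum Thevenin voltages are preserved (equation A3.5).
    assert math.isclose(b_lo / (a_hi + b_lo), q_lo / (p_hi + q_lo))
    assert math.isclose(b_hi / (a_lo + b_hi), q_hi / (p_lo + q_hi))
    # Every resistance of the approximating divider exceeds R_l.
    assert min(a_lo, a_hi, b_lo, b_hi) > r_lo
    print((a_lo, a_hi), (b_lo, b_hi))
```

Only the lower bound of the series resistance enters the computation, which matches the decision above to preserve the minimum value of the Thevenin resistance.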

